Preface



We are pleased to introduce the third edition of the book, and we are thankful to 
those who used the book in research and practice and to those who sent us com- 
ments and feedback. As before, our objective is to present, in an easily accessible 
manner, logistics and supply chain models, algorithms, and tools. In this edition, 
we have attempted to build on the positive elements of the first two editions and 
to include what we have learned in the last few years, since the publication of the 
second edition. 

In the last two decades, the academic community has focused on addressing 
many supply chain challenges. In some cases, the focus is on characterizing the 
structure of the optimal policy and identifying algorithms that generate the best 
possible policies. When this is not possible, the focus has been on an approach 
whose purpose is to ascertain characteristics of the problem or of an algorithm 
that are independent of the specific problem data. That is, the approach deter- 
mines characteristics of the solution or the solution method that are intrinsic to 
the problem and not the data. This approach includes the so-called worst-case and 
average-case analyses, which, as illustrated in the book, help not only to under- 
stand characteristics of the problem or solution methodology, but also to provide 
specific guarantees of effectiveness. In many cases, the insights obtained from these 
analyses can then be used to develop practical and effective algorithms for specific 
complex logistics problems. Finally, game-theoretic approaches have been applied
in the last few years to provide more insight into supply chain models involving
competition and collaboration.

We have made several important changes to the third edition of this text. Many 
of these changes have been a result of new research or consulting engagements we
have completed in the last few years. Our major changes include 

• a new chapter on game theory, where we introduce the reader to key concepts
and techniques (Chap. 3), 

• a new chapter on supply chain competition and collaboration, where we 
extensively apply game theory (Chap. 11), 

• a new chapter on process flexibility, where we explain the power of limited 
degree of flexibility (Chap. 13), 

• a new section (Sect. 2.3) on discrete convex analysis, 

• a new section (Sect. 9.6) on stochastic inventory models with positive lead 
times. 

Additionally, we have expanded the material on integrated inventory and pricing
models, including three new sections on the economic lot sizing model with pricing 
(Sect. 8.4), demand models (Sect. 10.2), and an alternative approach for deriving 
the structure of optimal policies (Sect. 10.5). 

As before, this book is written for graduate students, researchers, and practi- 
tioners interested in the mathematics of logistics and supply chain management. 
We assume the reader is familiar with the basics of linear programming and prob- 
ability theory and, in a number of sections, complexity theory and graph theory, 
although in many cases these can be skipped without loss of continuity. 

Parts of this book are based on work we have done either together or with 
others. Indeed, some of the chapters originated from papers we have published in 
journals such as Mathematics of Operations Research, Mathematical Programming,
Operations Research, and IIE Transactions. We rewrote most of these, trying to
present the results in a simple yet general and unified way. However, a number of 
key results, proofs, and discussions are reprinted without substantial change. Of 
course, in each case this was done by providing the appropriate reference and by 
obtaining permission of the copyright owner. In the case of Operations Research 
and Mathematics of Operations Research, it is the Institute for Operations Research
and the Management Sciences (INFORMS). Chapter 13 is heavily based
on the paper by David Simchi-Levi and Yehua Wei, “Understanding the Perfor- 
mance of the Long Chain and Sparse Designs in Process Flexibility,” which was 
published in Operations Research in 2012. Chapter 14 borrows extensively from 
“Supply Chain Design and Planning — Applications of Optimization Techniques 
for Strategic and Tactical Models,” written by Ana Muriel and David Simchi-Levi 
and published in the Handbooks in Operations Research and Management Science,
the volume on Supply Chain Management, S. Graves and A. G. Kok, eds., North-Holland,
Amsterdam. Similarly, Chap. 20 borrows extensively from Designing and
Managing the Supply Chain, written by David Simchi-Levi, Philip Kaminsky, and
Edith Simchi-Levi and published by McGraw-Hill in 2007. 



1

Introduction 



1.1 What Is Logistics Management? 

For many companies, the ability to efficiently match demand and supply is key to 
their success. Failure to do so could lead to loss of revenue, reduced service levels, 
impacted reputation, and decline in the company’s market share. Unfortunately, 
recent developments such as intense market competition, product proliferation, 
and the increase in the number of products with a short life cycle have created 
an environment where customer demand is volatile and unpredictable. In such 
an environment, traditional operations strategies such as building inventory, in- 
vesting in capacity buffers, or increasing committed response time to consumers 
do not offer a competitive advantage. Therefore, many companies are looking for 
effective strategies to respond to market changes without significantly increasing 
cost, inventory, or response time. This has motivated a continuous evolution of the 
management of logistics systems. 

In these systems, items are produced at one or more factories, shipped to ware- 
houses and distribution centers for intermediate storage, and then shipped to 
retailers or customers. Consequently, to reduce cost and improve service levels, 
logistics strategies must take into account the interactions of these various levels 
in this logistics network, also referred to as the supply chain. This network consists
of suppliers, manufacturing centers, warehouses, distribution centers, and retailer 
outlets, as well as raw materials, work-in-process inventory, and finished products 
that flow between the facilities; see Fig. 1.1. 

FIGURE 1.1. The logistics network (manufacturing plants and suppliers, warehouses, and retail stores and customers)



The goal of this book is to present the state of the art in the science of logistics
management. But what exactly is logistics management? According to the Council
of Supply Chain Management Professionals (CSCMP), a nonprofit organization of 
business personnel, it is that part of the business that 

plans, implements, and controls the efficient, effective forward and reverse
flow and storage of goods, services and related information between
the point of origin and the point of consumption in order to meet
customers’ requirements. 



This definition leads to several observations. First, logistics management takes 
into consideration every facility that has an impact on cost and plays a role in 
making the product conform to customer requirements: from supplier and manu- 
facturing facilities, through warehouses and distribution centers, to retailers and 
stores. Indeed, in some supply chain analyses, it is necessary to account for the 
suppliers’ suppliers and the customers’ customers because they have an impact on 
supply chain performance. 

Second, the objective of logistics management is to be efficient and cost-effective 
across the entire system; total systemwide costs, from transportation and distribution
to inventories of raw materials, work-in-process, and finished goods, are
to be minimized. Thus, the emphasis is not on simply minimizing transportation 
cost or reducing inventories but, rather, on taking a systems approach to logistics 
management. 

Finally, because logistics management revolves around planning, implementing,
and controlling the logistics network, it encompasses many of the firm’s activities, 
from the strategic level through the tactical to the operational level. 

Following Hax and Candea’s (1984) treatment of production-inventory systems, 
logistical decisions are typically classified into three levels. 

• The strategic level deals with decisions that have a long-lasting effect on 
the firm. This includes decisions regarding the number, location, and capaci- 
ties of warehouses and manufacturing plants, or the flow of material through 
the logistics network. 

• The tactical level typically includes decisions that are updated anywhere
between once every week and once every quarter. This includes purchasing and
production decisions, inventory policies, and transportation strategies, in- 
cluding the frequency with which customers are visited. 

• The operational level refers to day-to-day decisions such as scheduling, 
routing, and loading trucks. 

Finally, what about supply chain management? What is the difference between 
supply chain management and logistics management? It is insightful to review the 
CSCMP definition. According to the CSCMP, 

Supply chain management encompasses the planning and management 
of all activities involved in sourcing and procurement, conversion, and 
all logistics management activities. 

Thus, according to this definition, supply chain management includes logistics 
management as well as the coordination and collaboration with business partners 
such as suppliers, third-party service providers, and customers. In this book, we will 
focus on models important to both logistics and supply chain management.



1.2 Managing Cost and Uncertainty 

What makes logistics, or supply chain management, difficult? Although we will 
discuss a variety of challenges throughout this text, they can all be related to one 
or both of the following observations: 

1. It is challenging to design and operate a logistics system so that systemwide 
costs are minimized and systemwide service levels are maintained. Indeed, it 
is frequently difficult to operate a single facility so that costs are minimized 
and the service level is maintained. The difficulty increases significantly when 
an entire system is being considered. 



2. Uncertainty is inherent in every logistics network; customer demand can 
never be forecast exactly, travel times will never be certain, and machines 
and vehicles will break down. Logistics networks need to be designed to
eliminate as much uncertainty as possible and to deal effectively with the 
uncertainty that remains. 

One reason it is difficult to manage cost and uncertainty is supply chain
dynamics. Indeed, in recent years, many suppliers and retailers have observed that 
while customer demand for specific products does not vary much, inventory and 
back-order levels fluctuate considerably across their supply chain. For instance, 
in examining the demand for Pampers disposable diapers, executives at Procter &
Gamble noticed an interesting phenomenon. 

As expected, retail sales of the product were fairly uniform; there is no particular 
day or month in which the demand is significantly higher or lower than in any other. 
However, the executives noticed that distributors’ orders placed to the factory 
fluctuated much more than retail sales. In addition, P&G’s orders to its suppliers 
fluctuated even more. This increase in variability as we travel up in the supply 
chain is referred to as the bullwhip effect. 

Even when demand is known precisely (e.g., because of contractual agreements), 
the planning process needs to account for demand and cost parameters varying 
over time due to the impact of seasonal fluctuations, trends, advertising and promotions,
competitors’ pricing strategies, and so forth. These time-varying demand
and cost parameters make it difficult to determine the most effective supply chain 
strategy, that is, the one that minimizes systemwide costs and conforms to cus- 
tomer requirements. 



1.3 Examples 

In this section, we introduce some of the logistics management issues that form 
the basis of the problems studied in the first four parts of the book. These issues 
span a large spectrum of logistics management decisions, at each of the three levels 
mentioned above. Our objective here is to briefly introduce the questions and the 
tradeoffs associated with these decisions. 

Network Configuration 

Consider the situation where several plants are producing products to serve a set 
of geographically dispersed retailers. The current set of facilities, that is, plants 
and warehouses, is deemed to be inappropriate, and management wants to re- 
organize or redesign the distribution network. This may be due, for example, to 
changing demand patterns or the termination of a leasing contract for a num- 
ber of existing warehouses. In addition, changing demand patterns may entail a 
change in plant production levels, a selection of new suppliers, and, in general, 
a new flow pattern of goods throughout the distribution network. The goal is to 
choose a set of facility locations and capacities, to determine production levels 
for each product at each plant, and to set transportation flows between facilities, 
either from plant to warehouse or from warehouse to retailer, in such a way that
total production, inventory, and transportation costs are minimized and various 
service-level requirements are satisfied. 

Production Planning 

A manufacturing facility must produce to meet demand for a product over a fixed 
finite horizon. In many real-world cases, it is appropriate to assume that demand is 
known over the horizon. This is possible, for example, if orders have been placed in 
advance or contracts have been signed specifying deliveries for the next few months. 
Production costs consist of a fixed amount, corresponding, say, to machine setup 
costs or times, and a variable amount, corresponding to the time it takes to produce 
one unit. A holding cost is incurred for each unit in inventory. The planner’s 
objective is to satisfy demand for the product in each period and to minimize 
the total production and inventory costs over the fixed horizon. Obviously, this 
problem becomes more difficult as the number of products manufactured increases. 
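
As an illustration of the single-product case, the following is a minimal Python sketch of a dynamic program for this planning problem. It rests on assumptions that the paragraph above does not state: no production capacity limit, no backlogging, zero starting inventory, costs that do not change over time, and a holding cost charged on each unit carried from one period to the next. Under these assumptions, some optimal plan produces only in periods that start with zero inventory, which is what the recursion exploits. The function name `lot_sizing` and the sample data are hypothetical.

```python
from typing import List, Tuple


def lot_sizing(demand: List[float], setup_cost: float,
               unit_cost: float, holding_cost: float) -> Tuple[float, List[float]]:
    """Return (minimum total cost, production quantity per period).

    Illustrative assumptions: uncapacitated production, no backlogging,
    zero starting inventory, time-invariant costs.
    """
    n = len(demand)
    best = [0.0] + [float("inf")] * n   # best[t]: min cost of covering periods 0..t-1
    prev = [0] * (n + 1)                # prev[t]: period of the last production run
    for t in range(1, n + 1):
        for j in range(t):              # last run in period j covers periods j..t-1
            qty = sum(demand[j:t])
            # A unit demanded in period k waits (k - j) periods in inventory.
            hold = sum(holding_cost * (k - j) * demand[k] for k in range(j, t))
            run = setup_cost + unit_cost * qty if qty > 0 else 0.0
            cost = best[j] + run + hold
            if cost < best[t]:
                best[t], prev[t] = cost, j
    # Walk the run boundaries backward to recover the production quantities.
    plan = [0.0] * n
    t = n
    while t > 0:
        j = prev[t]
        plan[j] = sum(demand[j:t])
        t = j
    return best[n], plan


cost, plan = lot_sizing([20, 50, 10, 50], setup_cost=100, unit_cost=0, holding_cost=1)
print(cost, plan)
```

In this small instance the sketch batches the first three periods into one run, because carrying 60 units forward is cheaper than paying a second setup.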

Inventory Control and Pricing Optimization 

Consider a retailer that maintains an inventory of a particular product. Since cus- 
tomer demand is random, the retailer has information regarding the probabilistic 
distribution of demand only. The retailer’s objective is to decide at what point 
to reorder a new batch of products, and how much to order. Typically, ordering 
costs consist of two parts: a fixed amount, independent of the size of the order, for 
example, the cost of sending a vehicle from the warehouse to the retailer; and a 
variable amount dependent on the number of products ordered. A linear inventory 
holding cost is incurred at a constant rate per unit of product per unit of time. 
The retailer must determine an optimal inventory policy to minimize the expected 
cost of ordering and holding inventory. In some situations, the price at which the 
product is sold to the end customer is also a decision variable. In this case, demand 
is not only random but is also affected by the selling price. The retailer’s objective 
is thus to find an inventory policy and a pricing strategy maximizing expected 
profit over the finite, or infinite, horizon. 
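
To make the cost components concrete, here is a small Python sketch that evaluates one candidate policy on a given demand sample path. The policy form (reorder when inventory falls to a reorder point, then order up to a target level), the zero delivery lead time, the treatment of unmet demand as lost, and the name `simulate_policy` are all illustrative assumptions rather than the models analyzed later in the book. Averaging the result over many sampled demand paths would give an estimate of the expected cost of the policy.

```python
from typing import Sequence


def simulate_policy(demands: Sequence[int], reorder_point: int, order_up_to: int,
                    fixed_cost: float, unit_cost: float, holding_cost: float,
                    start_inventory: int = 0) -> float:
    """Total ordering plus holding cost of a reorder-point / order-up-to policy.

    Illustrative assumptions: periodic review, zero lead time, lost sales,
    and reorder_point < order_up_to.
    """
    inventory = start_inventory
    total = 0.0
    for d in demands:
        if inventory <= reorder_point:
            qty = order_up_to - inventory
            total += fixed_cost + unit_cost * qty   # fixed part plus variable part
            inventory = order_up_to
        inventory = max(inventory - d, 0)           # unmet demand is lost
        total += holding_cost * inventory           # charged on end-of-period stock
    return total


print(simulate_policy([3, 4, 2, 5], reorder_point=2, order_up_to=8,
                      fixed_cost=10, unit_cost=1, holding_cost=0.5))
```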

Procurement Strategies and Supply Contracts 

In traditional logistics strategies, each party in the network focuses on its own profit 
and hence makes decisions with little regard to their impact on other partners. 
Relationships between suppliers and buyers are established by means of supply 
contracts that specify pricing and volume discounts, delivery lead times, qual- 
ity, returns, and so forth. The question, of course, is whether supply contracts 
can also be used to replace the traditional strategy with one that optimizes the 
performance of the entire network. In particular, what is the impact of volume 
discount and revenue-sharing contracts on supply chain performance? Are there
pricing strategies that can be applied by suppliers to incentivize buyers to order
more products while at the same time increasing the supplier’s profit? What are
the risks associated with supply contracts, and how can these risks be minimized? 

Process Flexibility 

In the last few years, companies have been looking for new ways to respond to 
change in demand volume and mix without increasing inventory, capacity, or re- 
sponse time. One possible way to achieve that objective is to invest in process 
flexibility, where each plant is capable of producing multiple products. In this 
case, when the demand for one product is higher than expected while the demand 
for a different product is lower than expected, a flexible manufacturing system can 
quickly make adjustments by shifting production capacities appropriately. Unfor- 
tunately, flexibility does not come free, and hence the questions are how much 
flexibility is needed, how can one achieve flexibility, and what are the potential 
benefits of a (small) investment in flexibility? 

Integration of Production, Inventory, and Transportation 
Decisions 

Consider the problem faced by companies that rely on LTL (less than truckload) 
carriers for the distribution of products across their supply chain. Typically, these 
carriers offer volume discounts to encourage larger shipments; as a result, the 
transportation charges borne by the shipper are often piecewise linear and concave. 
In this case, the timing and routing of shipments need to be coordinated so as to 
minimize systemwide costs, including production, inventory, transportation, and 
shortage costs, by taking advantage of economies of scale offered by the carriers. 
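
The shape of such a transportation charge can be sketched in a few lines of Python. The rate table below is entirely hypothetical; it only illustrates that decreasing marginal rates produce a piecewise linear, concave total charge, so that one consolidated shipment costs no more than the same volume split into smaller shipments.

```python
from typing import List, Tuple

# Hypothetical rate table: (breakpoint in units, marginal rate per unit beyond it).
# The marginal rates decrease, which makes the total charge concave.
RATE_TABLE: List[Tuple[float, float]] = [(0, 5.0), (100, 3.0), (500, 2.0)]


def shipping_cost(quantity: float,
                  rates: List[Tuple[float, float]] = RATE_TABLE) -> float:
    """Piecewise linear charge for shipping `quantity` units in one shipment."""
    cost = 0.0
    for i, (start, rate) in enumerate(rates):
        end = rates[i + 1][0] if i + 1 < len(rates) else float("inf")
        if quantity > start:
            # Only the units that fall inside this segment pay this rate.
            cost += rate * (min(quantity, end) - start)
    return cost


# One shipment of 200 units versus two shipments of 100 units each.
print(shipping_cost(200), 2 * shipping_cost(100))
```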

Vehicle Fleet Management 

A warehouse supplies products to a set of retailers using a fleet of vehicles of limited 
capacity. A dispatcher is in charge of assigning loads to vehicles and determining 
vehicle routes. First, the dispatcher must decide how to partition the retailers 
into groups that can be feasibly served by a vehicle, that is, whose loads fit in a 
vehicle. Second, the dispatcher must decide what sequence to use so as to minimize 
cost. Typically, one of two cost functions is possible: In the first, the objective is to 
minimize the number of vehicles used, while in the second, the focus is on reducing 
the total distance traveled. The latter is an example of a single-depot capacitated 
vehicle routing problem (CVRP), where a set of customers has to be served by a
fleet of vehicles of limited capacity. The vehicles are initially located at a depot 
(in this case, the warehouse) and the objective is to find a set of vehicle routes of 
minimal total length. 
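
The two requirements of a CVRP solution, capacity feasibility and total route length, can be checked with a short Python sketch. It assumes straight-line (Euclidean) distances between points in the plane, which the text does not specify, and the function name `evaluate_routes` and the data layout are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def evaluate_routes(depot: Point, customers: Dict[str, Tuple[Point, float]],
                    routes: List[List[str]], capacity: float) -> Tuple[bool, float]:
    """Return (is the solution feasible?, total length of all routes).

    `customers` maps a name to (location, load). Distances are Euclidean,
    which is an assumption made for this sketch.
    """
    feasible = True
    total = 0.0
    for route in routes:
        load = sum(customers[c][1] for c in route)
        if load > capacity:
            feasible = False                      # the loads do not fit in one vehicle
        stops = [depot] + [customers[c][0] for c in route] + [depot]
        total += sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))
    # Every customer must be served exactly once.
    visited = sorted(c for route in routes for c in route)
    if visited != sorted(customers):
        feasible = False
    return feasible, total


customers = {"A": ((3.0, 4.0), 4.0), "B": ((6.0, 8.0), 5.0), "C": ((0.0, 5.0), 3.0)}
print(evaluate_routes((0.0, 0.0), customers, [["A", "B"], ["C"]], capacity=10.0))
```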

Truck Routing 

Consider a truck that leaves a warehouse to deliver products to a set of retailers. 
The order in which the retailers are visited will determine how long the delivery 
will take and at what time the vehicle can return to the warehouse. Therefore, it
is important that the vehicle follow an efficient route. The problem of finding the
minimal length route, in either time or distance, from a warehouse through a set 
of retailers is an example of a traveling salesman problem (TSP). Clearly, truck 
routing is a subproblem of the fleet management example above. 
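
For a handful of retailers, the minimal-length route can be found by simply trying every visiting order, as in the Python sketch below. This exhaustive search is shown only to make the problem statement concrete: the number of orders grows factorially with the number of retailers, so it is practical only for very small instances. Euclidean distances and the name `shortest_tour` are assumptions of the sketch.

```python
import itertools
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def shortest_tour(warehouse: Point, retailers: Sequence[Point]) -> Tuple[float, List[Point]]:
    """Exhaustively search all visiting orders; return (length, best order)."""
    best_len, best_order = float("inf"), ()
    for order in itertools.permutations(retailers):
        stops = [warehouse, *order, warehouse]   # leave from and return to the warehouse
        length = sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))
        if length < best_len:
            best_len, best_order = length, order
    return best_len, list(best_order)


length, order = shortest_tour((0.0, 0.0), [(4.0, 3.0), (0.0, 3.0), (4.0, 0.0)])
print(length, order)
```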

Packing Problems 

In many logistics applications, a collection of items must be packed into boxes, 
bins, or vehicles of limited size. The objective is to pack the items such that 
the number of bins used is as small as possible. This problem is referred to as 
the bin-packing problem (BPP). For example, it appears as a special case of the 
CVRP when the objective is to minimize the number of vehicles used to deliver the 
products. Bin-packing also appears in many other applications, including cutting 
standard-length wire or paper strips into specific customer order sizes. It also often
appears as a subproblem in other combinatorial problems. 
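
As a concrete illustration, the Python sketch below packs items with first-fit decreasing, a simple and widely known heuristic: items are sorted from largest to smallest and each is placed in the first open bin with enough room. It is offered as an example of a packing rule, not as the method analyzed in this book, and the function name is hypothetical. The ceiling of total size divided by bin capacity gives a simple lower bound on the number of bins any packing needs.

```python
import math
from typing import List


def first_fit_decreasing(sizes: List[float], capacity: float) -> List[List[float]]:
    """Pack items into bins of the given capacity; return the contents of each bin."""
    bins: List[List[float]] = []
    free: List[float] = []                 # remaining room in each open bin
    for size in sorted(sizes, reverse=True):
        if size > capacity:
            raise ValueError("item larger than bin capacity")
        for i, room in enumerate(free):
            if size <= room:               # first open bin with enough room
                bins[i].append(size)
                free[i] -= size
                break
        else:                              # no open bin fits: open a new one
            bins.append([size])
            free.append(capacity - size)
    return bins


items = [4, 8, 1, 4, 2, 1]
packing = first_fit_decreasing(items, capacity=10)
lower_bound = math.ceil(sum(items) / 10)
print(len(packing), lower_bound, packing)
```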



1.4 Modeling Logistics Problems 

The reader observes that most of the problems and issues described in the previ- 
ous section are fairly well defined mathematically. These are the types of issues, 
questions, and problems addressed in this book. Of course, many issues important 
to logistics or supply chain management are difficult to quantify and therefore to 
address mathematically; we will not cover these in this book. This includes topics 
related to information systems, outsourcing, third-party logistics, strategic part- 
nering, etc. For a detailed analysis of these topics, we refer the reader to the book 
by Simchi-Levi et al. (2007) or the more recent book by Simchi-Levi (2010). 

The fact that the examples provided in the previous section can be defined 
mathematically is, obviously, meaningless unless all required data are available. 
As we discuss in Part V of this book, finding, verifying, and tabulating the data 
are typically very problematic. Indeed, inventory holding costs, production costs, 
extra vehicle costs, and warehouse capacities are often difficult to determine. Fur- 
thermore, identifying the data relevant to a particular logistics or supply chain 
problem adds another layer of complexity to the data-gathering problem. Even 
when the data do exist, there are other difficulties related to modeling complex 
real-world problems. For example, in our analysis we ignore issues such as varia- 
tions in travel times, variable yields in production, inventory shrinkage, forecast- 
ing, crew scheduling, and so on. These issues complicate logistics and supply chain 
practice considerably. 

For most of this book, we assume that all relevant data, for example, production 
costs, production times, warehouse fixed costs, travel times, and holding costs, are 
given. As a result, each logistics or supply chain problem analyzed in Parts I-IV 
is well defined and thus merely a mathematical problem. 



1.5 Logistics and Supply Chain in Practice

How are logistics and supply chain problems addressed in practice? That is, how 
are these difficult problems solved in the real world? In our experience, companies 
use several approaches. First and foremost, as in other aspects of life, people tend 
to repeat what has worked in the past. That is, if last year’s safety stock level 
was enough to avoid backlogging demands, then the same level might be used this 
year. If last year’s delivery routes were successful, that is, all retailers received 
their deliveries on time, then why change them? Second, there are so-called rules 
of thumb that are widely used and, at least on the surface, may be quite effec- 
tive. For example, it is our experience that many logistics managers often use the 
“20/80 rule,” which says that about 20 % of the products contribute to about 80 % 
of the total cost and therefore it is sufficient to concentrate efforts on these critical 
products. Logistics network design, to give another example, is an area where a 
variety of rules of thumb are used. One such rule might suggest that if your com- 
pany serves the continental United States and it needs only one warehouse, then 
this warehouse should probably be located in the Chicago area; if two are required, 
then one in Los Angeles and one in Atlanta should suffice. Finally, some companies 
try to apply the experience and intuition of logistics experts and consultants, the 
idea being that what has worked well for a competitor should work well for itself. 
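
The "20/80 rule" mentioned above amounts to a simple ranking computation, sketched here in Python. The product names and cost figures are invented for illustration, and `critical_products` is a hypothetical helper: it ranks products by cost and keeps adding them until the chosen share of total cost is covered.

```python
from typing import Dict, List


def critical_products(costs: Dict[str, float], share: float = 0.8) -> List[str]:
    """Smallest set of top-cost products covering at least `share` of total cost."""
    total = sum(costs.values())
    running = 0.0
    selected: List[str] = []
    for name, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        if running >= share * total:
            break
        selected.append(name)
        running += cost
    return selected


# Hypothetical data: 10 products, two of which dominate total cost.
costs = {"P1": 500.0, "P2": 320.0, **{f"P{i}": 22.5 for i in range(3, 11)}}
print(critical_products(costs))
```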

Of course, while all these approaches are appealing and quite often result in 
logistics strategies that make sense, it is not clear how much is lost by not focusing
on the best (or close to the best) strategy for the particular case at hand.
Indeed, recently, with the advent of cheap computing power, it has become increasingly
affordable for many firms, not just large ones, to acquire and use sophisticated
advanced planning systems (APS) to optimize their logistics and supply chain
strategies. In these systems, data are entered, reviewed, and validated, various
algorithms are executed, and a suggested solution is presented in a user-friendly way.
Provided the data are correct and the system is solving the appropriate problem,
these APS can substantially reduce systemwide cost. Also, a satisfactory
solution that managers can implement is typically arrived at only after an
iterative process in which the user evaluates various scenarios and assesses their
impact on costs and service levels. Although this may not exactly be considered
“optimization” in a strict sense, it usually serves as a useful tool for the system’s 
user. 

These planning systems have as their nucleus models and algorithms in some 
form or another. In some cases, the system may simply be a computerized version 
of the rules of thumb above. In more and more instances, however, these systems 
apply techniques that have been developed by the operations, management science, 
and computer science research communities. 

In this book, we present the current state of the art in mathematical research in
the area of logistics. Some of the problems listed above represent difficult stochas- 
tic optimization problems that require concepts such as convexity and super- 
modularity, and their extensions for their analysis. Some problems require the 
use of methods from game theory in order to understand how different supply 
chain partners respond to various challenges. Other problems have at their core
extremely difficult combinatorial problems in the class called NP-Hard problems.
This implies that it is very unlikely that one can construct an algorithm that will 
always find the optimal solution, or the best possible decision, in computation time 
that is polynomial in the “size” of the problem. The interested reader can refer 
to the excellent book by Garey and Johnson (1979) for details on computational 
complexity. Therefore, in many cases, an algorithm that consistently provides the 
optimal solution is not considered a reachable goal, and hence heuristic, or ap- 
proximation, methods are employed. 



1.6 Evaluation of Solution Techniques 

A fundamental research question is how to evaluate heuristic or approximation 
methods. Such methods can range from simple “rules of thumb” to complex, com- 
putationally intensive, mathematical programming techniques. In general, these 
are methods that will find good solutions to the problem in a reasonable amount 
of time. Of course, the terms “good” and “reasonable” depend on the heuristic and 
on the problem instance. Also, what constitutes reasonable time may be highly 
dependent on the environment in which the heuristic will be used; that is, it de- 
pends on whether or not the algorithm needs to solve the logistics problem in real 
time. 

Assessing and quantifying a heuristic’s effectiveness is of prime concern. 

Traditionally, the following methods have been employed. 

• Empirical comparisons: Here, a representative sample of problems is cho- 
sen and the performance of a variety of heuristics is compared. The compar- 
ison can be based on solution quality or computation time, or a combination 
of the two. This approach has one obvious drawback: deciding on a good set 
of test problems. The difficulty, of course, is that a heuristic may perform 
well on one set of problems but may perform poorly on the next. As pointed 
out by Fisher (1995), this lack of robustness forces practitioners to “patch 
up” the heuristic to fix the troublesome cases, leading to an algorithm with 
growing complexity. After considerable effort, a procedure may be created 
that works well for the situation at hand. Unfortunately, the resulting algo- 
rithm is usually extremely sensitive to changes in the data and may perform 
poorly when transported to other environments. 

• Worst-case analysis: In this type of analysis, one tries to determine the 
maximum deviation from optimality, in terms of relative error, that a heuris- 
tic can incur on any problem instance. For example, a heuristic for the BPP 
might guarantee that any solution constructed by the heuristic uses at most 
50 % more bins than the optimal solution. Or a heuristic for the TSP might 
guarantee that the length of the route provided by the heuristic is at most 
twice the length of the optimal route. Using a heuristic with such a guarantee 
allays some of the fears of suboptimality, by guaranteeing that we are within
a certain percentage of optimality. Of course, one of the main drawbacks of 
this approach is that a heuristic may perform very well on most instances that 
are likely to appear in a real-world application but may perform extremely 
poorly on some highly contrived instances. Hence, when we compare algo- 
rithms, it is not clear that a heuristic with a better worst-case performance 
guarantee is necessarily more effective in practice. 

• Average-case analysis: Here, the purpose is to determine a heuristic’s av- 
erage performance. This is stated as the average relative error between the 
heuristic solution and the optimal solution under specific assumptions on the 
distribution of the problem data. This may include probabilistic assumptions 
on the depot location, demand size, item size, time windows, vehicle capaci- 
ties, etc. As we shall see, while these probabilistic assumptions may be quite 
general, this approach also has its drawbacks. The most important is
that an average-case analysis is usually only possible for large-size
problems. For example, in the BPP, if the item sizes are uniformly distributed 
(between zero and the bin capacity), then a heuristic that will be “close to 
optimal” is one that first sorts the items in nonincreasing order and then, 
starting with the largest item, pairs each item with the largest item with 
which it fits. In what sense is it close to optimal? The analysis shows that 
as the problem size increases (the number of items increases), the relative 
error between the solution created by the heuristic and the optimal solution 
decreases to zero. Another drawback is that in order for an average-case 
analysis to be tractable, it is sometimes necessary to assume independent 
customer behavior. Finally, determining what probabilistic assumptions are 
appropriate in a particular real-world environment is not a trivial problem. 
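
The pairing heuristic described in the average-case discussion above can be sketched in Python as follows. This is one possible reading of the verbal description (take the largest remaining item, then pair it with the largest remaining item that still fits beside it), so each bin receives at most two items. The function name `pair_largest_fit` is hypothetical.

```python
import bisect
import math
from typing import List


def pair_largest_fit(sizes: List[float], capacity: float) -> List[List[float]]:
    """Pair each item, largest first, with the largest remaining item that fits."""
    remaining = sorted(sizes)              # ascending, so the largest is at the end
    if remaining and remaining[-1] > capacity:
        raise ValueError("item larger than bin capacity")
    bins: List[List[float]] = []
    while remaining:
        largest = remaining.pop()
        # Index just past the largest remaining item that fits in the leftover room.
        k = bisect.bisect_right(remaining, capacity - largest)
        if k > 0:
            bins.append([largest, remaining.pop(k - 1)])
        else:
            bins.append([largest])         # nothing fits: the item travels alone
    return bins


items = [9, 1, 6, 4, 5, 5]
packing = pair_largest_fit(items, capacity=10)
print(len(packing), math.ceil(sum(items) / 10), packing)
```

To observe the asymptotic behavior numerically, one could run the function on growing samples of sizes drawn uniformly between zero and the bin capacity and compare the number of bins used with the lower bound given by the ceiling of total size over capacity. That bound is generally below the optimal value, so such an experiment can only overstate the relative error.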

Because of the advantages and potential drawbacks of each of the approaches, 
we agree with Fisher (1980) that these should be treated as complementary ap- 
proaches rather than competing ones. Indeed, it is our experience that the logistics 
algorithms that are most successfully applied in practice are those with good per- 
formance in at least two of the above measures. 

We should also point out that characterizing the worst-case or average-case 
performance of a heuristic may be technically very difficult. Therefore, a heuristic 
may perform very well on average, or in the worst case, but proving this fact may 
be beyond our current abilities. 



1.7 Additional Topics 

We emphasize that due to space and time considerations, we have been obliged 
to omit some important and interesting results. These include results regarding 
yield management, machine scheduling, random yield in production, and dynamic 
and stochastic fleet management models, among others. We refer the reader to
Graves et al. (1993), Ball et al. (1995), De Kok and Graves (2003), Simchi-Levi 
et al. (2004), and Ozer and Phillips (2012) for excellent surveys of these and other 
related topics. 

Also, while many elegant and strong results concerning approaches to certain 
logistics problems exist, there are still many areas where little, if anything, is 
known. This is, of course, partly due to the fact that as the models become more 
complex and integrate more and more issues that arise in practice, their analysis 
becomes more difficult. 

Finally, we remark that it is our firmly held belief that logistics and supply chain
management is one of the areas in which a rigorous mathematical analysis yields
not only elegant results but, even more importantly, has had, and will continue to 
have, a significant impact on the practice of logistics and supply chains. 



1.8 Book Overview 

This book is meant as a survey of a variety of results covering most topics in 
the area of logistics. The reader should have a basic understanding of complexity 
theory, linear programming, probability theory, and graph theory. Of course, the 
book can be read easily without the reader’s delving into the details of each proof. 

The book is organized as follows. In Part I, we concentrate on performance anal- 
ysis techniques. Specifically, in Chap. 2, we introduce the concepts, and associated 
properties, of convexity, supermodularity, and discrete convexity. In Chap. 3, we 
provide a concise introduction to some of the key concepts and results in game 
theory. In Chap. 4, we discuss some of the basic tools required to perform worst- 
case analysis, while in Chap. 5, we cover average-case analysis. Finally, in Chap. 6, 
we investigate the performance of mathematical programming-based approaches. 

Part II concentrates on production and inventory problems. We start with lot 
sizing in two different deterministic environments, one with constant demand 
(Chap. 7) and the second with varying demand (Chap. 8). Chapter 9 focuses on 
stochastic inventory models, while Chap. 10 presents new results for the coordina- 
tion of inventory and pricing decisions. The chapter distinguishes between models 
appropriate for risk-neutral and risk-averse decision makers.

Part III deals with supply chain design and coordination models. These include 
Chap. 11, which focuses on competition and collaboration models in supply chains, 
and Chap. 12 on effective supply contracts, such as buy back, revenue-sharing, 
and portfolio contracts. Chapter 13 deals with process flexibility, and Chap. 14 ad- 
dresses models that integrate production, inventory, and transportation decisions 
across the supply chain. Finally, Chap. 15 analyzes distribution network configu- 
ration and facility location, also referred to as site selection, problems. 

In Part IV, we consider vehicle routing problems, paying particular attention 
to heuristics with good worst-case or average-case performance. Chapter 16 con- 
tains an analysis of the single-depot capacitated vehicle routing problem when 
all customers have equal demands, while Chap. 17 analyzes the case of customers
with unequal demands. In Chap. 18, we perform an average-case analysis of the 
vehicle routing problem with time window constraints. We also investigate set- 
partitioning-based approaches and column generation techniques in Chap. 19. 

In Part V, we look at the practice of logistics management and in particular 
at issues related to the design, development, and implementation of APS. Specif- 
ically, in Chap. 20, we look at network planning issues from logistics network de- 
sign, through inventory positioning, all the way to resource allocation. Finally, in 
Chap. 21, we report on the development of a decision support tool for school bus 
routing and scheduling in the City of New York. 



Part I 



Performance Analysis 
Techniques 



2 

Convexity and Supermodularity 



The concepts of convexity and supermodularity are important in the 
optimization and economics literature. These concepts have been widely applied 
in the analysis of a variety of supply chain models, from stochastic, multi-period 
inventory problems to pricing models. Hence, in this chapter, we provide a brief int- 
roduction to convexity and supermodularity, focusing on materials most relevant 
to our context. We also briefly introduce some concepts and results from discrete 
convex analysis, which interestingly is an elegant combination of both convexity 
and submodularity. For more details, readers are referred to the three excellent 
books Rockafellar (1970) on convex analysis, Topkis (1998) on supermodularity, 
and Murota (2003) on discrete convex analysis. 



2.1 Convex Analysis 

2.1.1 Convex Sets and Convex Functions 

Before we present the definition of convex sets, we introduce some notation that
will be used. Throughout this book, we use $\mathbb{R}^n$ to denote an n-dimensional
Euclidean space, and “$\subseteq$” and “$\subset$” for set inclusion and strict set
inclusion, respectively. For a set $C$, we write $x \in C$ if $x$ is an element of $C$.

Definition 2.1.1 A set $C \subseteq \mathbb{R}^n$ is called convex if, for any $x, x' \in C$ and
$\lambda \in [0, 1]$, $(1 - \lambda)x + \lambda x' \in C$.
Geometrically, a set is convex if and only if for any two points in the set, the
line segment between these two points also lies in the set (Fig. 2.1). Here are some
simple examples of convex sets: an interval in $\mathbb{R}^1$; a disk and a square in $\mathbb{R}^2$; a
ball and a cube in $\mathbb{R}^3$. Also note that a set of solutions of a system of linear
inequalities, that is, $\{x \in \mathbb{R}^n : Ax \le b\}$, is convex, where $A$ is a linear mapping
from $\mathbb{R}^n$ to $\mathbb{R}^m$ and $b$ is a vector in $\mathbb{R}^m$. Finally, the intersection of convex sets is
also convex, and convexity is preserved under a linear transformation; namely, the
set $AC + b := \{Ax + b \mid x \in C\}$ is still convex if $C$ is.
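As a small numerical illustration of the claim that the solution set of a system of linear inequalities is convex, the sketch below checks that convex combinations of two feasible points remain feasible. The helper names and the triangle used as an example are assumptions made for illustration only.

```python
def in_polyhedron(A, b, x, tol=1e-9):
    """Return True if A x <= b holds row by row (up to a small tolerance)."""
    return all(
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i + tol
        for row, b_i in zip(A, b)
    )


def convex_combination(x, xp, lam):
    """Return (1 - lam) * x + lam * xp, computed componentwise."""
    return [(1 - lam) * xi + lam * xpi for xi, xpi in zip(x, xp)]


if __name__ == "__main__":
    # Triangle {x : x1 + x2 <= 4, x1 >= 0, x2 >= 0}.
    A = [[1, 1], [-1, 0], [0, -1]]
    b = [4, 0, 0]
    x, xp = [4, 0], [0, 4]
    print(all(in_polyhedron(A, b, convex_combination(x, xp, k / 10))
              for k in range(11)))
```

A check of this kind can only confirm the property on the sampled points; the general statement follows directly from Definition 2.1.1 and the linearity of $A$.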




FIGURE 2.1. Examples of convex sets and nonconvex sets



Definition 2.1.2 Given a convex set $C$ in $\mathbb{R}^n$, a function $f : C \to \mathbb{R}$ is convex
over set $C$ if for any $x, x' \in C$ and $\lambda \in [0, 1]$,

$$f((1 - \lambda)x + \lambda x') \le (1 - \lambda) f(x) + \lambda f(x'). \qquad (2.1)$$

$f$ is strictly convex if the inequality (2.1) holds strictly for any $x, x' \in C$ with
$x \ne x'$ and $\lambda \in (0, 1)$. Finally, $f$ is called (strictly) concave if $-f$ is (strictly)
convex.
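Inequality (2.1) can be probed numerically for functions of one variable. The sketch below is only an assumption-laden illustration (the name `convexity_violations` and the sample points are hypothetical): it reports sampled triples at which (2.1) fails, so an empty result is evidence of convexity on the sample, not a proof.

```python
def convexity_violations(f, points, lambdas, tol=1e-9):
    """Return the sampled triples (x, x', lam) at which inequality (2.1) fails."""
    bad = []
    for x in points:
        for xp in points:
            for lam in lambdas:
                lhs = f((1 - lam) * x + lam * xp)
                rhs = (1 - lam) * f(x) + lam * f(xp)
                if lhs > rhs + tol:
                    bad.append((x, xp, lam))
    return bad


if __name__ == "__main__":
    pts = [-2.0, -1.0, 0.0, 1.0, 2.0]
    lams = [0.0, 0.25, 0.5, 0.75, 1.0]
    print(len(convexity_violations(lambda x: x * x, pts, lams)))   # convex
    print(len(convexity_violations(lambda x: -x * x, pts, lams)))  # concave
```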

Remark: When $f$ is (strictly) convex over $\mathbb{R}^n$, we simply say that $f$ is (strictly)
convex. From now on, we mainly focus on the case when $C = \mathbb{R}^n$ to simplify our
presentation. In fact, almost all the results about convex functions defined over $\mathbb{R}^n$
hold for convex functions defined over a convex subset of $\mathbb{R}^n$, possibly with minor
modification.

Sometimes it is convenient to work with functions that take the value of infinity.
In this case, for a given convex set $C$ in $\mathbb{R}^n$, a function $f : C \to \bar{\mathbb{R}}$ is convex over $C$
in the extended sense if the inequality (2.1) still holds for $f$, where $\bar{\mathbb{R}} = \mathbb{R} \cup \{\infty\}$.
The arithmetic convention here includes $\infty + \infty = \infty$, $0 \cdot \infty = 0$, and $a \cdot \infty = \infty$
for $a > 0$. Of course, usually we can restrict ourselves to the effective domain of
function $f$, which is defined as follows:

$$\mathrm{dom}(f) := \{x \in C \mid f(x) < \infty\}.$$

But on some occasions, it is more economical to use convex functions in the
extended sense. Finally, for a convex function $f : C \to \mathbb{R}$, define

$$\bar{f}(x) = \begin{cases} f(x), & \text{if } x \in C, \\ \infty, & \text{otherwise.} \end{cases}$$

It is easy to see that $\bar{f} : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ is convex in the extended sense.

It is also interesting to point out that if a function $f : \mathbb{R}^n \to \mathbb{R}$ is continuous, then
the convexity of the function $f$ is equivalent to the following midpoint convexity:
For any $x, x' \in \mathbb{R}^n$, $f((x + x')/2) \le \frac{1}{2}(f(x) + f(x'))$. This is left as an exercise.

We now establish a connection between convex sets and convex functions through
the epigraph mapping. The epigraph of a function $f : C \to \mathbb{R}$ is defined as

$$\mathrm{epi}(f) := \{(x, \alpha) \mid x \in C,\ \alpha \in \mathbb{R},\ f(x) \le \alpha\}.$$

It is easy to verify that a function $f$ is convex on a convex set $C$ if and only if its
epigraph $\mathrm{epi}(f)$ is convex. The epigraph mapping allows us to translate properties
of convex sets into results about convex functions.

The graphical meaning of a convex function is clear; see Fig. 2.2 for an illustration.
In fact, a function $f$ is convex if and only if for any given $x$ and $x'$, the curve
$((1 - \lambda)x + \lambda x',\ f((1 - \lambda)x + \lambda x'))$ for $\lambda \in [0, 1]$ always lies below the line segment
connecting the two points $(x, f(x))$ and $(x', f(x'))$. Obviously, a linear function is both
convex and concave.




In the following, we summarize some useful properties of convex functions. The 
proof for those properties is straightforward. 

Proposition 2.1.3 

(a) Any nonnegative linear combination of convex functions is convex. That is,
if $f_i : \mathbb{R}^n \to \mathbb{R}$ are convex for $i = 1, 2, \ldots, m$, then for any scalars $\alpha_i \ge 0$,
$\sum_{i=1}^m \alpha_i f_i$ is also convex.

(b) A composition of a convex function and an affine function is still convex.
That is, if $f : \mathbb{R}^m \to \mathbb{R}$ is convex, then for any linear mapping $A$ from $\mathbb{R}^n$
to $\mathbb{R}^m$ and a vector $b$ in $\mathbb{R}^m$, $f(Ax + b)$ is also a convex function of $x \in \mathbb{R}^n$.

(c) A composition of a nondecreasing convex function and a convex function
is still convex. That is, if $f : \mathbb{R} \to \mathbb{R}$ is convex and nondecreasing and
$g : \mathbb{R}^n \to \mathbb{R}$ is convex, then $f(g(x))$ is convex.

(d) If $f_k : \mathbb{R}^n \to \mathbb{R}$ is convex for $k = 1, 2, \ldots$ and $\lim_{k \to \infty} f_k(x) = f(x)$ for any
$x \in \mathbb{R}^n$, then $f(x)$ is convex.

(e) Assume that a function $f(\cdot, \cdot)$ is defined on the product space $\mathbb{R}^n \times \mathbb{R}^m$.
If $f(\cdot, y)$ is convex for any given $y \in \mathbb{R}^m$, then for a random vector $\zeta$ in
$\mathbb{R}^m$, $E_\zeta[f(x, \zeta)]$ is convex, provided it is well defined. As a special case, if
$f : \mathbb{R}^n \to \mathbb{R}$ is convex, then $E_\zeta[f(x - \zeta)]$ is also convex.

A weaker definition of convex functions is quasiconvex, which is also commonly
used.

Definition 2.1.4 A function $f : \mathbb{R}^n \to \mathbb{R}$ is called quasiconvex on a convex set $C$
if, for any $x, x' \in C$ and $\lambda \in [0, 1]$,

$$f((1 - \lambda)x + \lambda x') \le \max\{f(x), f(x')\}.$$

The quasiconvexity of function $f : \mathbb{R}^n \to \mathbb{R}$ is equivalent to the fact that $-f$ is
unimodal. That is, if $x^*$ is a global maximizer of function $-f$, then for any $x \in \mathbb{R}^n$,
$-f((1 - \lambda)x + \lambda x^*)$ is a nondecreasing function of $\lambda$ for $\lambda \in [0, 1]$.

For a function $f : C \to \mathbb{R}$ and a given $\alpha \in \mathbb{R}$, define the level set of $f$ as

$$L_\alpha(f) := \{x \in C \mid f(x) \le \alpha\}.$$

One can show, from the definition of quasiconvexity, that $f$ is quasiconvex on a
convex set $C$ if and only if the level set $L_\alpha(f)$ is convex for any $\alpha \in \mathbb{R}$.

2.1.2 Continuity and Differentiability Properties

In this section, we discuss the continuity and differentiability properties of convex
functions. Before we proceed to prove the continuity of convex functions, observe
that the convexity of a function $f : \mathbb{R}^n \to \mathbb{R}$ is equivalent to the following: For any
$x^i \in \mathbb{R}^n$, $\lambda_i \in \mathbb{R}$ with $\lambda_i \ge 0$ ($i = 1, 2, \ldots, m$) and $\sum_{i=1}^m \lambda_i = 1$,

$$f\Big(\sum_{i=1}^m \lambda_i x^i\Big) \le \sum_{i=1}^m \lambda_i f(x^i). \qquad (2.2)$$

This is a special case of Jensen’s inequality.
Theorem 2.1.5 If $f : \mathbb{R}^n \to \mathbb{R}$ is convex, then it is continuous.

Proof. We only need to show, without loss of generality, that $f$ is continuous at
$x = 0$.

First, we argue that $f(x)$ is bounded above over the set $S = \{x \in \mathbb{R}^n \mid \|x\|_1 =
\sum_{i=1}^n |x_i| \le 1\}$. Let $e_i$ ($i = 1, 2, \ldots, n$) be a unit vector in $\mathbb{R}^n$ with 1 at its $i$th
component and 0 at other components, and let $e_{n+i} = -e_i$ ($i = 1, 2, \ldots, n$). Then
for any $x \in S$, there exist $\lambda_i \ge 0$ ($i = 1, 2, \ldots, 2n$) with $\sum_{i=1}^{2n} \lambda_i = 1$ such that
$x = \sum_{i=1}^{2n} \lambda_i e_i$. Therefore, (2.2) implies that $f(x) \le \max_{i=1,2,\ldots,2n} f(e_i)$.

We now show that for any sequence $x^k \in \mathbb{R}^n$ ($k = 1, 2, \ldots$) convergent to $x = 0$,
$f(x^k)$ converges to $f(0)$. Since $x^k$ converges to 0, we can assume without loss of
generality that $x^k \in S$ for all $k$. The definition of convex functions implies that

$$f(x^k) \le (1 - \|x^k\|_1) f(0) + \|x^k\|_1 f(x^k / \|x^k\|_1).$$

Letting $k$ tend to infinity, we have

$$\limsup_{k \to \infty} f(x^k) \le f(0),$$

where $\limsup$ denotes the upper limit of a sequence. Also, observe that

$$0 = \Big(1 - \frac{\|x^k\|_1}{1 + \|x^k\|_1}\Big) x^k + \frac{\|x^k\|_1}{1 + \|x^k\|_1} \big(-x^k / \|x^k\|_1\big).$$

Again, the definition of convex functions implies that

$$f(0) \le \Big(1 - \frac{\|x^k\|_1}{1 + \|x^k\|_1}\Big) f(x^k) + \frac{\|x^k\|_1}{1 + \|x^k\|_1} f\big(-x^k / \|x^k\|_1\big).$$

When $k$ goes to infinity, we have

$$f(0) \le \liminf_{k \to \infty} f(x^k),$$

where $\liminf$ denotes the lower limit of a sequence. Thus, $\lim_{k \to \infty} f(x^k) = f(0)$ for any
sequence $x^k$ convergent to $x = 0$, and therefore $f$ is continuous at 0. ∎

It is appropriate to point out that if the domain of a convex function is not
the whole space, it may not be continuous at the boundary points [for instance,
$f : [0, 1] \to \mathbb{R}$ with $f(x) = 0$ for $x \in (0, 1]$ and $f(0) = 1$]. However, it is always
continuous at the interior points of its domain. A natural question is whether it is
also differentiable at these interior points. Unfortunately, it is not always the case.
For example, the absolute value function $|x|$ is convex while not differentiable at
$x = 0$. Even though a convex function may not be differentiable, we will show in
the following that for a convex function, its directional derivative always exists and
possesses nice properties. Recall that for any $x, y \in \mathbb{R}^n$, the directional derivative
of a function $f : \mathbb{R}^n \to \mathbb{R}$ at $x$ is defined as follows:

$$f'(x; y) := \lim_{t \downarrow 0} \frac{f(x + ty) - f(x)}{t}.$$
For a function $f$ defined on one-dimensional space, let $f'_+(x) = f'(x; 1)$ and
$f'_-(x) = -f'(x; -1)$, and define

$$Df(x, t) = \frac{f(x + t) - f(x)}{t}.$$

The following result, which has been widely applied, is helpful in establishing the
monotonicity properties of $f'_+$ and $f'_-$.

Proposition 2.1.6 Assume that a function $f : \mathbb{R} \to \mathbb{R}$ is convex. Then for any
$x, x', t, t' \in \mathbb{R}$ with $x \le x'$ and $0 < t \le t'$, or $x < x + t \le x' < x' + t'$,

$$Df(x, t) \le Df(x', t'). \qquad (2.3)$$

In particular, when $t = t'$, $f$ has increasing differences; that is, for any $x \le x'$,
$t \ge 0$,

$$f(x + t) - f(x) \le f(x' + t) - f(x').$$



Proof. Observe that $x \le x + t$ and $x' \le x' + t$. There exist $\lambda, \lambda' \in [0, 1]$ such that
$x + t = (1 - \lambda)x + \lambda(x' + t)$ and $x' = (1 - \lambda')x + \lambda'(x' + t)$. The definitions of $\lambda$
and $\lambda'$ imply that $\lambda + \lambda' = 1$. From the convexity of $f$, we have

$$f(x + t) \le (1 - \lambda) f(x) + \lambda f(x' + t)$$

and

$$f(x') \le (1 - \lambda') f(x) + \lambda' f(x' + t).$$

Adding the two inequalities together and rearranging terms, we have that

$$f(x + t) - f(x) \le f(x' + t) - f(x'). \qquad (2.4)$$

Thus, a convex function has increasing differences.

We now assume that $0 < t \le t'$. From the convexity of $f$, we have that

$$f(x' + t) \le \Big(1 - \frac{t}{t'}\Big) f(x') + \frac{t}{t'} f(x' + t'),$$

which immediately implies that

$$\frac{f(x' + t) - f(x')}{t} \le \frac{f(x' + t') - f(x')}{t'}.$$

This inequality, together with the inequality (2.4), implies the inequality (2.3).

Finally, assume that $x < x + t < x' < x' + t'$. Again, the convexity of $f$ implies
that

$$f(x + t) \le (1 - \lambda) f(x) + \lambda f(x')$$

and

$$f(x') \le (1 - \lambda') f(x + t) + \lambda' f(x' + t'),$$

where $\lambda = \frac{t}{x' - x}$ and $\lambda' = \frac{x' - (x + t)}{x' + t' - (x + t)}$. The above inequalities are equivalent to
the following:

$$\frac{f(x + t) - f(x)}{t} \le \frac{f(x') - f(x + t)}{x' - (x + t)}$$

and

$$\frac{f(x') - f(x + t)}{x' - (x + t)} \le \frac{f(x' + t') - f(x')}{t'}.$$

Therefore, the inequality (2.3) holds if $x < x + t < x' < x' + t'$. The continuity of
convex functions implies that (2.3) is still true if $x < x + t \le x' < x' + t'$. ∎
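The monotonicity of the difference quotient in (2.3) is easy to observe numerically. The following sketch, with the hypothetical helper name `diff_quotient`, evaluates $Df(x, t)$ for the absolute value function; it is meant only as an illustration of Proposition 2.1.6 on a few sample values.

```python
def diff_quotient(f, x, t):
    """Df(x, t) = (f(x + t) - f(x)) / t for t != 0."""
    return (f(x + t) - f(x)) / t


if __name__ == "__main__":
    # For a convex f and a fixed x, Df(x, t) is nondecreasing in t > 0.
    print([diff_quotient(abs, -1, t) for t in (0.5, 1, 2, 4)])
    # Increasing differences: the same step t gains more further to the right.
    print(diff_quotient(abs, -1, 1) <= diff_quotient(abs, 0, 1))
```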

Theorem 2.1.7 Assume that $f : \mathbb{R} \to \mathbb{R}$ is convex. Then

(a) $f'_+$ and $f'_-$ are well defined, and for any $x, x' \in \mathbb{R}$ with $x < x'$,
$f'_-(x) \le f'_+(x) \le f'_-(x')$.

(b) For any fixed $x \in \mathbb{R}$,

$$f(x + t) - f(x) \ge \xi t \quad \forall\, t \in \mathbb{R}$$

if and only if $\xi \in [f'_-(x), f'_+(x)]$.



Proof. From (2.3), we know that $Df(x, t)$ is a nondecreasing function of $t$ for
$t > 0$ (simply let $x' = x$) and is bounded below by $Df(x', t')$ for any $x' < x$ and
$0 < t' \le x - x'$. Therefore,

$$f'_+(x) = \inf_{t > 0} Df(x, t).$$

Similarly, $Df(x - t, t)$ is nonincreasing in $t$, and hence

$$f'_-(x) = \sup_{t > 0} Df(x - t, t).$$

Indeed, define a new convex function $g(x) = f(-x)$ and the nonincreasing property
of $Df(x - t, t)$ follows from applying Proposition 2.1.6 to the function $g$.

Again from (2.3), it is easy to see that for any $x < x'$ and $0 < t < x' - x$,

$$Df(x - t, t) \le Df(x, t) \le Df(x' - t, t).$$

Letting $t \downarrow 0$ yields $f'_-(x) \le f'_+(x) \le f'_-(x')$. Thus, part (a) is true.

Finally, part (b) is a direct consequence of the proof for part (a). ∎

Theorem 2.1.7 part (b) implies that for any $\xi \in [f'_-(x), f'_+(x)]$, the function
$f(x')$ always lies above the line $L_x = \{(x', f(x) + \xi(x' - x)) \mid x' \in \mathbb{R}\}$ for any
$x$; see Fig. 2.3 for an illustration. This result can be extended to convex functions
defined in $\mathbb{R}^n$. For this purpose, we introduce the concept of subgradient.

FIGURE 2.3. Illustration of the definition of subgradient



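Definition 2.1.8 Given a function $f : \mathbb{R}^n \to \mathbb{R}$, $\xi \in \mathbb{R}^n$ is a subgradient of the
function $f$ at $x \in \mathbb{R}^n$ if, for any $t \in \mathbb{R}^n$,

$$f(x + t) - f(x) \ge \langle \xi, t \rangle, \qquad (2.5)$$

where $\langle \xi, t \rangle = \sum_{i=1}^n \xi_i t_i$ is the inner product between $\xi$ and $t$. Let the subdifferential
$\partial f(x)$ be the set of all subgradients of $f$ at $x$.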
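For a function of one variable, inequality (2.5) can be tested on a finite set of steps. The sketch below uses a hypothetical helper, `is_subgradient_1d`, and the absolute value function, whose subdifferential at 0 is the interval $[-1, 1]$. A finite sample can refute a candidate subgradient but cannot certify one, so a `True` result should be read as "not refuted on these steps."

```python
def is_subgradient_1d(f, x, xi, steps, tol=1e-12):
    """Check f(x + t) - f(x) >= xi * t, i.e., (2.5) with n = 1, on sampled t."""
    return all(f(x + t) - f(x) >= xi * t - tol for t in steps)


if __name__ == "__main__":
    steps = [k / 10 for k in range(-50, 51)]
    print(is_subgradient_1d(abs, 0.0, 0.3, steps))   # 0.3 lies in [-1, 1]
    print(is_subgradient_1d(abs, 0.0, 1.5, steps))   # 1.5 does not
```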

The following theorem characterizes properties of subgradients. The proof of 
these properties is omitted since it is quite involved; see Rockafellar (1970) for 
more details. 

Theorem 2.1.9 Assume that a function $f : \mathbb{R}^n \to \mathbb{R}$ is convex. The following
results hold.

(a) For any $x \in \mathbb{R}^n$, $\partial f(x)$ is nonempty, convex, and compact.

(b) For any $x, y \in \mathbb{R}^n$, the directional derivative of $f$ can be expressed as follows:

$$f'(x; y) = \sup_{\xi \in \partial f(x)} \langle \xi, y \rangle.$$

(c) $f$ is differentiable at $x \in \mathbb{R}^n$ if and only if $\partial f(x) = \{\nabla f(x)\}$, where $\nabla f(x)$
denotes the gradient of $f$ at $x$.

(d) For any compact set $C \subset \mathbb{R}^n$, $\bigcup_{x \in C} \partial f(x)$ is compact.

Remark: If the domain of $f$ is not the whole space, Theorem 2.1.9 may fail at
the boundary points of its domain. For example, consider the convex function
$f : [0, 1] \to \mathbb{R}$ with $f(0) = 1$ and $f(x) = 0$ for $x \in (0, 1]$. Clearly, it is not
continuous and has no subgradient at $x = 0$.

We now present the general form of Jensen’s inequality. 
Proposition 2.1.10 Let $f$ be a convex function over $\mathbb{R}$ and $\zeta$ a random variable
with finite expectation $E[\zeta]$. We have

$$f(E[\zeta]) \le E[f(\zeta)].$$

Proof. This proposition can be proven by using the special case of Jensen’s
inequality (2.2) and the continuity of convex functions as well as the definition of
expectations. We present an alternative approach based on the properties of
subgradients.

Choose any $\xi \in \partial f(E[\zeta])$. From the definition of subgradients, we have that

$$f(\zeta) - f(E[\zeta]) \ge \langle \xi, \zeta - E[\zeta] \rangle.$$

Taking expectations on both sides yields $f(E[\zeta]) \le E[f(\zeta)]$. ∎
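Jensen's inequality can be observed on an empirical distribution, where the expectation is a sample average and the inequality reduces to the finite form (2.2) with equal weights. The sketch below is an illustration under assumptions of our own choosing: the helper name `jensen_gap`, the uniform distribution, and the fixed seed are not from the text.

```python
import math
import random


def jensen_gap(f, sample):
    """Return mean(f(z)) - f(mean(z)), which is nonnegative when f is convex."""
    mean = sum(sample) / len(sample)
    return sum(f(z) for z in sample) / len(sample) - f(mean)


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.uniform(0, 10) for _ in range(1000)]
    print(jensen_gap(lambda z: z * z, sample) >= 0)   # gap is the sample variance
    print(jensen_gap(math.exp, sample) >= 0)
```

For an affine function the gap vanishes (up to rounding), which matches the fact that a linear function is both convex and concave.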



2.1.3 Characterization of Convex Functions 

The concept of convexity is widely used in optimization. However, identifying 
convex functions is not always simple. In this section, we give some sufficient and 
necessary conditions for a differentiable function to be convex. 

Theorem 2.1.11 Consider a function $f : \mathbb{R}^n \to \mathbb{R}$.

(a) If $f$ is differentiable, then $f$ is convex if and only if for any $x, x' \in \mathbb{R}^n$,

$$f(x') - f(x) \ge \langle \nabla f(x), x' - x \rangle, \qquad (2.6)$$

where $\langle x, y \rangle = \sum_{i=1}^n x_i y_i$ is the inner product of $x, y \in \mathbb{R}^n$.

(b) If $f$ is differentiable, then $f$ is strictly convex if the inequality (2.6) holds
strictly for any $x \ne x'$.

(c) If $f$ is continuously differentiable, then $f$ is convex if and only if $\nabla f$ is
monotone; that is, for any $x, x' \in \mathbb{R}^n$, $\langle \nabla f(x') - \nabla f(x), x' - x \rangle \ge 0$.

(d) If $f$ is twice continuously differentiable, then $f$ is convex if and only if $\nabla^2 f(x)$
is positive semidefinite for any $x \in \mathbb{R}^n$.

(e) If $f$ is twice continuously differentiable, then $f$ is strictly convex if $\nabla^2 f(x)$
is positive definite for any $x \in \mathbb{R}^n$.

(f) Assume that $f(x) = x^T Q x$ for a symmetric matrix $Q$ of order $n$ and $x \in \mathbb{R}^n$.
Then $f$ is convex if and only if $Q$ is positive semidefinite. And $f$ is strictly
convex if and only if $Q$ is positive definite.
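To make parts (a) and (f) concrete, the sketch below evaluates the gap in the first-order inequality (2.6) for a quadratic $f(x) = x^T Q x$, whose gradient is $2Qx$ when $Q$ is symmetric. The helper names are hypothetical, and the two matrices are examples chosen here, one positive definite and one indefinite.

```python
def quad(Q, x):
    """f(x) = x^T Q x for a symmetric matrix Q given as nested lists."""
    n = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))


def grad(Q, x):
    """Gradient of x^T Q x for symmetric Q, namely 2 Q x."""
    n = len(x)
    return [2 * sum(Q[i][j] * x[j] for j in range(n)) for i in range(n)]


def first_order_gap(Q, x, xp):
    """f(x') - f(x) - <grad f(x), x' - x>, the slack in inequality (2.6)."""
    slope = sum(g * (b - a) for g, a, b in zip(grad(Q, x), x, xp))
    return quad(Q, xp) - quad(Q, x) - slope


if __name__ == "__main__":
    print(first_order_gap([[2, 1], [1, 2]], [1, 0], [0, 1]))  # positive definite Q
    print(first_order_gap([[1, 2], [2, 1]], [1, 0], [0, 1]))  # indefinite Q
```

For a quadratic, this gap equals $(x' - x)^T Q (x' - x)$, so it is nonnegative for every pair of points exactly when $Q$ is positive semidefinite.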
Proof. Assume that $f$ is differentiable. Pick any $x, x' \in \mathbb{R}^n$ and define

$$\phi(\lambda) := f(x + \lambda(x' - x)).$$

First, notice that $f(x)$ is convex in $x$ if and only if $\phi(\lambda)$ is convex for $\lambda \in [0, 1]$ for
any picked $x, x' \in \mathbb{R}^n$. Also, observe that

$$\phi'(\lambda) = \langle \nabla f(x + \lambda(x' - x)), x' - x \rangle.$$

If $\phi(\lambda)$ is convex in $\lambda$, then Theorem 2.1.7 part (b) implies that

$$f(x') - f(x) = \phi(1) - \phi(0) \ge \phi'(0) = \langle \nabla f(x), x' - x \rangle.$$

Hence, the inequality (2.6) holds for any $x, x' \in \mathbb{R}^n$. On the other hand, if the
inequality (2.6) is true for any $x, x' \in \mathbb{R}^n$, then for any $z = (1 - \lambda)x + \lambda x'$ with
$\lambda \in [0, 1]$, we have

$$f(x) - f(z) \ge \langle \nabla f(z), x - z \rangle$$

and

$$f(x') - f(z) \ge \langle \nabla f(z), x' - z \rangle.$$

Multiplying the first inequality by $(1 - \lambda)$ and the second inequality by $\lambda$ and
summing them up, we end up with

$$f((1 - \lambda)x + \lambda x') \le (1 - \lambda) f(x) + \lambda f(x').$$

Thus, $f$ is convex and part (a) is true. Obviously, from the above argument, one
can see that $f$ is strictly convex if the inequality (2.6) holds strictly. Therefore,
part (b) holds.

For part (c), notice that if $\phi(\lambda)$ is convex, then Theorem 2.1.7 part (a) implies
that $\phi'(0) \le \phi'(1)$. Thus,

$$\langle \nabla f(x') - \nabla f(x), x' - x \rangle = \phi'(1) - \phi'(0) \ge 0.$$

On the other hand, if $\nabla f$ is monotone, then for any $\lambda' \ge \lambda \ge 0$,

$$\phi'(\lambda') = \langle \nabla f(x + \lambda'(x' - x)), x' - x \rangle \ge \langle \nabla f(x + \lambda(x' - x)), x' - x \rangle = \phi'(\lambda).$$

Therefore, $\phi'(\lambda)$ is nondecreasing, and hence

$$\phi(\lambda) = \phi(0) + \int_0^\lambda \phi'(\xi)\, d\xi \le \phi(0) + \lambda \phi'(\lambda) \qquad (2.7)$$

and

$$\phi(\lambda) = \phi(1) - \int_\lambda^1 \phi'(\xi)\, d\xi \le \phi(1) - (1 - \lambda) \phi'(\lambda). \qquad (2.8)$$

Multiplying the first inequality by $(1 - \lambda)$ and the second inequality by $\lambda$ and
summing them up, we end up with

$$\phi(\lambda) \le (1 - \lambda)\phi(0) + \lambda \phi(1); \qquad (2.9)$$

that is, $\phi$ is convex for $\lambda \in [0, 1]$. Thus, $f$ is convex. Also, notice that from the
above proof, $\nabla f$ is monotone if and only if $\phi'(\lambda)$ is nondecreasing for $\lambda \in [0, 1]$.
We now assume that $f$ is twice continuously differentiable. In this case,

$$\phi''(\lambda) = \langle x' - x, \nabla^2 f(x + \lambda(x' - x))(x' - x) \rangle.$$

Notice that for any $0 \le \lambda \le \lambda' \le 1$,

$$\phi'(\lambda') - \phi'(\lambda) = \int_\lambda^{\lambda'} \phi''(\xi)\, d\xi.$$

Therefore, if $\nabla^2 f(x)$ is positive semidefinite for any $x \in \mathbb{R}^n$, then $\phi''(\xi) \ge 0$ for
any $\xi$, and hence $\phi'(\lambda)$ is nondecreasing, which in turn implies that $f$ is convex
as we already proved for part (c). On the other hand, the convexity of $f$ implies
that $\phi'$ is nondecreasing, which in turn implies that $\phi''(\lambda) \ge 0$ for any $\lambda \in [0, 1]$.
In particular, we have

$$0 \le \phi''(0) = \langle x' - x, \nabla^2 f(x)(x' - x) \rangle.$$

Since $x' \in \mathbb{R}^n$ is arbitrary, we have that $\nabla^2 f(x)$ is positive semidefinite. This
proves part (d).

If $\nabla^2 f(x)$ is positive definite for any $x \in \mathbb{R}^n$, then for $x \ne x'$, $\phi'(\lambda)$ is strictly
increasing for $\lambda \in [0, 1]$ and at least one of the inequalities (2.7) and (2.8) holds
as a strict inequality. Hence, the inequality (2.9) holds strictly. This implies that
$\phi$, and therefore $f$, is strictly convex. Thus, part (e) holds.

Finally, for part (f), we only need to prove that if $f$ is strictly convex, then $Q$ is
positive definite. The remaining results are special cases of parts (d) and (e). If $f$ is
strictly convex, then part (d) implies that $Q$ is positive semidefinite. Assume to the
contrary that $Q$ is not positive definite. There exists a nonzero vector $z \in \mathbb{R}^n$ such
that $f(z) = \langle z, Qz \rangle = 0$. Therefore, for any $x \in \mathbb{R}^n$, $f(x + \lambda z) = f(x) + 2\lambda \langle x, Qz \rangle$,
which is a linear function of $\lambda$. Thus, $f$ is not strictly convex, a contradiction.
Hence, $Q$ is positive definite if $f$ is strictly convex. ∎



2.1.4 Convexity and Optimization 

Convexity plays an important role in optimization theory. In particular, we show 
in the following that a local minimizer of a convex minimization problem, namely, 
the problem of minimizing a convex function, is a global minimizer of this problem, 
and the first-order optimality condition is both sufficient and necessary for a point 
to be a global minimizer. As we shall see, this result has implications both for 
optimization theory and for algorithms. 

Theorem 2.1.12 Assume that $f : \mathbb{R}^n \to \mathbb{R}$ is convex. If $x^*$ is a local minimizer
of $f$, then $x^*$ is a global minimizer of $f$. Furthermore, $x^*$ is a global minimizer of
$f$ if and only if $0 \in \partial f(x^*)$.

Proof. If $x^*$ is a local minimizer, then there exists a ball $B_\epsilon(x^*) = \{x \in \mathbb{R}^n \mid \|x - x^*\|_2 \le \epsilon\}$
for some $\epsilon > 0$ such that $f(x) \ge f(x^*)$ for any $x \in B_\epsilon(x^*)$. Moreover,
for any $x \in \mathbb{R}^n$, there exists $\lambda \in (0, 1)$ such that $(1 - \lambda)x^* + \lambda x \in B_\epsilon(x^*)$. From the
definition of convexity, we have

$$f(x^*) \le f((1 - \lambda)x^* + \lambda x) \le (1 - \lambda) f(x^*) + \lambda f(x).$$

The inequality implies that $f(x^*) \le f(x)$ for any $x \in \mathbb{R}^n$. Hence, $x^*$ is a global
minimizer of $f$, and from the definition of a subgradient, we have $0 \in \partial f(x^*)$.
Finally, if $0 \in \partial f(x^*)$, then the definition of a subgradient implies that $f(x) \ge
f(x^*)$ for any $x \in \mathbb{R}^n$. In other words, $x^*$ is a global minimizer of the function $f$.
Therefore, $x^*$ is a global minimizer of $f$ if and only if $0 \in \partial f(x^*)$. ∎
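One algorithmic consequence of Theorem 2.1.12 is that a purely local comparison rule suffices to locate a global minimizer of a convex function. As a minimal sketch (ternary search is a standard technique chosen here for illustration, not a method from the text), the routine below minimizes a convex function of one variable over an interval without using derivatives, so it also works for nondifferentiable functions.

```python
def ternary_search_min(f, lo, hi, tol=1e-9):
    """Approximate a minimizer of a convex function f on the interval [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) <= f(m2):
            hi = m2      # convexity: some minimizer lies in [lo, m2]
        else:
            lo = m1      # convexity: some minimizer lies in [m1, hi]
    return (lo + hi) / 2


if __name__ == "__main__":
    # (x - 3)^2 + |x| is convex, with minimizer x = 2.5.
    print(ternary_search_min(lambda x: (x - 3) ** 2 + abs(x), -10.0, 10.0))
```

The discarding step is justified only by convexity (more generally, unimodality); for a function with several local minima the same rule may discard the global one.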

The following result is a straightforward consequence of the definition of strictly 
convex functions. 

Theorem 2.1.13 A strictly convex function $f : \mathbb{R}^n \to \mathbb{R}$ has at most one local
and global minimizer.

We now consider the convex function maximization problem. If $f : \mathbb{R} \to \mathbb{R}$ is
convex, then from the definition of convexity, one can see that either $a$ or $b$ is
an optimal solution for the problem $\max_{x \in [a, b]} f(x)$. More generally, we have the
following result regarding the convex function maximization problem.

Theorem 2.1.14 Assume that a set $C \subset \mathbb{R}^n$ is compact and $f : \mathbb{R}^n \to \mathbb{R}$ is
convex. Then $\max_{x \in C} f(x)$ achieves maximization at an extreme point $x^*$ of $C$.
That is, there exist no $x, x' \in C$ with $x \ne x'$ such that $x^* = (x + x')/2$.

We provide some intuition for the theorem instead of a formal proof. Assume that
a maximizer of $f(x)$ over the set $C$, $x^*$, is not an extreme point of $C$. Then there
exist $x, x' \in C$ with $x \ne x'$ such that $x^* = (x + x')/2$. Let $L = \{x^* + t(x' - x) \in
C \mid t \in \mathbb{R}\}$ be a line segment in $C$. It is clear, from the definition of convexity, that
all points on the line segment $L$ are maximizers of the function $f(x)$. Let $\bar{x}$ be one
of the endpoints of $L$. If $\bar{x}$ is an extreme point of $C$, we are done; otherwise, we
can repeat the above process and the theorem follows since such a process cannot
proceed an infinite number of times.
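When the compact set is a box, its extreme points are its corners, so Theorem 2.1.14 suggests that a convex function can be maximized by enumerating corners. The sketch below does this for small dimensions; the function name `max_over_box_corners` and the test function are assumptions made for illustration, and the enumeration grows as $2^n$.

```python
import itertools


def max_over_box_corners(f, lower, upper):
    """Return a corner of the box [lower, upper] that maximizes f among corners."""
    corners = itertools.product(*zip(lower, upper))
    return max(corners, key=f)


if __name__ == "__main__":
    f = lambda p: (p[0] - 1) ** 2 + (p[1] + 0.5) ** 2   # convex
    best = max_over_box_corners(f, [-2, -1], [3, 2])
    print(best, f(best))
```

A coarse grid over the same box can be used to confirm that no interior point does better than the best corner.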

The following proposition shows that under some conditions, convexity is pre- 
served under optimization operations. 

Proposition 2.1.15 Given a function $f(\cdot, \cdot)$ defined on the product space
$\mathbb{R}^n \times \mathbb{R}^m$.

(a) If $f(\cdot, y)$ is convex for any given $y \in \mathbb{R}^m$, then for any set $C \subseteq \mathbb{R}^m$, the
function $h : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ defined by

$$h(x) := \sup_{y \in C} f(x, y)$$

is also convex (possibly in the extended sense).

(b) Assume that for any $x \in \mathbb{R}^n$, there is an associated set $C(x) \subseteq \mathbb{R}^m$ and
$C := \{(x, y) \mid y \in C(x), x \in \mathbb{R}^n\}$ is convex. If $f$ is convex and the function

$$g(x) := \inf_{y \in C(x)} f(x, y)$$

is well defined, then $g$ is also convex over $\mathbb{R}^n$.

Proof. To prove part (a), observe that for any given $y \in C$, the set

$$\mathrm{epi}(f(\cdot, y)) = \{(x, \alpha) \mid x \in \mathbb{R}^n,\ \alpha \in \mathbb{R},\ f(x, y) \le \alpha\}$$

is convex. Therefore, $\mathrm{epi}(h) = \bigcap_{y \in C} \mathrm{epi}(f(\cdot, y))$ is a convex set, which implies that
$h$ is convex.

For part (b), let us fix $x, x' \in \mathbb{R}^n$ and $\lambda \in [0, 1]$. From the definition of an
infimum, there exist, for any given $\epsilon > 0$, $y \in C(x)$ and $y' \in C(x')$ such that

$$f(x, y) \le g(x) + \epsilon \quad \text{and} \quad f(x', y') \le g(x') + \epsilon. \qquad (2.10)$$

Since $C$ is convex, we have that $((1 - \lambda)x + \lambda x', (1 - \lambda)y + \lambda y') \in C$. Thus,
$(1 - \lambda)y + \lambda y' \in C((1 - \lambda)x + \lambda x')$ and

$$g((1 - \lambda)x + \lambda x') \le f((1 - \lambda)x + \lambda x', (1 - \lambda)y + \lambda y')$$
$$\le (1 - \lambda) f(x, y) + \lambda f(x', y')$$
$$\le (1 - \lambda) g(x) + \lambda g(x') + \epsilon,$$

where the second inequality holds since $f(\cdot, \cdot)$ is convex and the last inequality
follows from (2.10). Since $\epsilon$ is arbitrary, we conclude that $g$ is convex. ∎
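Part (a) of the proposition can be illustrated with a finite set $C$, in which case the supremum is a maximum. In the sketch below, the helper `pointwise_sup` and the function $f(x, y) = xy - y^2$ are assumptions chosen for illustration; $f$ is affine (hence convex) in $x$ for each fixed $y$, so the resulting $h$ is a maximum of affine functions.

```python
def pointwise_sup(f, ys):
    """Return h with h(x) = max over y in ys of f(x, y); ys must be finite."""
    return lambda x: max(f(x, y) for y in ys)


if __name__ == "__main__":
    h = pointwise_sup(lambda x, y: x * y - y * y, [-1, 0, 2])
    # h(x) = max(-x - 1, 0, 2x - 4), a piecewise-linear convex function.
    print([h(x) for x in (-3, 0, 3)])
```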



2.2 Supermodularity 

In this section, we introduce the concept of supermodularity. Though this concept 
can be defined on general partially ordered sets, for our purpose we focus on the 
Euclidean space $\mathbb{R}^n$ with the standard partial order $\le$; that is, for any $x, x' \in \mathbb{R}^n$,
$x \le x'$ if and only if $x_i \le x'_i$ for $i = 1, 2, \ldots, n$.

To present the definition of supermodularity, we first introduce two
operations, join and meet operations, in $\mathbb{R}^n$. For any two points $x = (x_1, x_2, \ldots, x_n)$
and $x' = (x'_1, x'_2, \ldots, x'_n)$ in $\mathbb{R}^n$, define their join as

$$x \vee x' = (\max\{x_1, x'_1\}, \max\{x_2, x'_2\}, \ldots, \max\{x_n, x'_n\}),$$

and their meet as

$$x \wedge x' = (\min\{x_1, x'_1\}, \min\{x_2, x'_2\}, \ldots, \min\{x_n, x'_n\}).$$

Of course, if $x \le x'$, namely, $x_i \le x'_i$ for $i = 1, 2, \ldots, n$, then $x \vee x' = x'$ and
$x \wedge x' = x$. A set $X \subseteq \mathbb{R}^n$ is called a lattice if for any $x, x' \in X$, $x \vee x', x \wedge x' \in X$.
Note that $X$ is called a sublattice of $\mathbb{R}^n$ in some literature as it inherits the infimum
and supremum from $\mathbb{R}^n$.
Definition 2.2.1 Suppose $X$ is a lattice in $\mathbb{R}^n$ and a function $f : X \to \mathbb{R}$. The
function $f$ is supermodular on the set $X$ if for any $x, x' \in X$,

$$f(x) + f(x') \le f(x \vee x') + f(x \wedge x'). \qquad (2.11)$$

$f$ is strictly supermodular if the inequality (2.11) holds strictly for unordered pairs
$x$ and $x'$; that is, neither $x \le x'$ nor $x \ge x'$ is true. A function $f$ is (strictly)
submodular if $-f$ is (strictly) supermodular.
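The join and meet operations and inequality (2.11) translate directly into code. The sketch below checks (2.11) over a finite grid, which is itself a lattice; the helper names are hypothetical and the check is exhaustive only for the grid supplied, not for all of $\mathbb{R}^n$.

```python
import itertools


def join(x, xp):
    """Componentwise maximum of two points."""
    return tuple(max(a, b) for a, b in zip(x, xp))


def meet(x, xp):
    """Componentwise minimum of two points."""
    return tuple(min(a, b) for a, b in zip(x, xp))


def is_supermodular_on_grid(f, axes, tol=1e-12):
    """Check inequality (2.11) for every pair of points of the product grid."""
    points = list(itertools.product(*axes))
    return all(
        f(x) + f(xp) <= f(join(x, xp)) + f(meet(x, xp)) + tol
        for x in points
        for xp in points
    )


if __name__ == "__main__":
    axes = [range(-2, 3), range(-2, 3)]
    print(is_supermodular_on_grid(lambda p: p[0] * p[1], axes))    # supermodular
    print(is_supermodular_on_grid(lambda p: -p[0] * p[1], axes))   # submodular only
```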

A closely related and more intuitive concept is (strictly) increasing differences.
Given $X \subseteq \mathbb{R}^n$ and $T \subseteq \mathbb{R}^m$, a function $f : X \times T \to \mathbb{R}$ has (strictly) increasing
differences if for any $t, t' \in T$ with $t \le t'$ ($t < t'$), $f(x, t') - f(x, t)$ is (strictly)
increasing in $x \in X$. Notice that a function $g : X \to \mathbb{R}$ is called increasing if for
any $x, x' \in X$ with $x \le x'$, $g(x) \le g(x')$.

This concept can be extended to functions defined on sets with a product
structure $\prod_{i=1}^n X_i$, where $X_i$ is a subset of some Euclidean space. For this purpose, we
define, for a function $f : \prod_{i=1}^n X_i \to \mathbb{R}$, any pair of indexes $i, j \in \{1, 2, \ldots, n\}$ and
any vector

$$\bar{x}_{ij} = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n) \in \prod_{k \ne i, j} X_k,$$

a function

$$f_{\bar{x}_{ij}}(x_i, x_j) = f(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_n).$$

The function $f : \prod_{i=1}^n X_i \to \mathbb{R}$ has (strictly) increasing differences if the function
$f_{\bar{x}_{ij}}(x_i, x_j)$ has (strictly) increasing differences for any pair of distinct indexes
$i, j \in \{1, 2, \ldots, n\}$ and any vector $\bar{x}_{ij} \in \prod_{k \ne i, j} X_k$.

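The following result shows that for functions defined on $\mathbb{R}^n$, the concept of
supermodularity is equivalent to the property of increasing differences.

Theorem 2.2.2 A function $f : \mathbb{R}^n \to \mathbb{R}$ is (strictly) supermodular if and only
if $f$ has (strictly) increasing differences when $\mathbb{R}^n$ is regarded as the product of $n$
one-dimensional real spaces.

Proof. Assume that $f$ has increasing differences. Then, for any $x, x' \in \mathbb{R}^n$, we have

$$f(x) - f(x \wedge x') = \sum_{i=1}^n \big( f(x_1, \ldots, x_{i-1}, x_i, x_{i+1} \wedge x'_{i+1}, \ldots, x_n \wedge x'_n)$$
$$\qquad - f(x_1, \ldots, x_{i-1}, x_i \wedge x'_i, x_{i+1} \wedge x'_{i+1}, \ldots, x_n \wedge x'_n) \big)$$
$$\le \sum_{i=1}^n \big( f(x_1 \vee x'_1, \ldots, x_{i-1} \vee x'_{i-1}, x_i, x'_{i+1}, \ldots, x'_n)$$
$$\qquad - f(x_1 \vee x'_1, \ldots, x_{i-1} \vee x'_{i-1}, x_i \wedge x'_i, x'_{i+1}, \ldots, x'_n) \big)$$
$$= \sum_{i=1}^n \big( f(x_1 \vee x'_1, \ldots, x_{i-1} \vee x'_{i-1}, x_i \vee x'_i, x'_{i+1}, \ldots, x'_n)$$
$$\qquad - f(x_1 \vee x'_1, \ldots, x_{i-1} \vee x'_{i-1}, x'_i, x'_{i+1}, \ldots, x'_n) \big)$$
$$= f(x \vee x') - f(x'),$$

where the inequality holds since $f$ has increasing differences and the second
equality holds since $\{x_i, x'_i\} = \{x_i \vee x'_i, x_i \wedge x'_i\}$. Hence, $f$ is supermodular.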
Assume now that $f$ is supermodular. For any pair of distinct indexes $i, j \in
\{1, 2, \ldots, n\}$, any vector

$$\bar{x}_{ij} = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n) \in \mathbb{R}^{n-2},$$

and $x_i, x'_i, x_j, x'_j \in \mathbb{R}$ with $x_i \le x'_i$ and $x_j \ge x'_j$, let

$$x = (x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_n),$$
$$x' = (x_1, \ldots, x_{i-1}, x'_i, x_{i+1}, \ldots, x_{j-1}, x'_j, x_{j+1}, \ldots, x_n).$$

The supermodularity of $f$ implies that

$$f_{\bar{x}_{ij}}(x_i, x_j) - f_{\bar{x}_{ij}}(x_i, x'_j) = f(x) - f(x \wedge x')$$
$$\le f(x \vee x') - f(x')$$
$$= f_{\bar{x}_{ij}}(x'_i, x_j) - f_{\bar{x}_{ij}}(x'_i, x'_j).$$

Thus, $f$ has increasing differences.

Finally, the equivalence of the strict supermodularity and the strictly increasing
differences can be established by following a similar argument. ∎

If a function $f : \Re^n \to \Re$ is differentiable, it is easy to verify that $f$ is supermodular
if and only if the partial derivative $\partial f(x) / \partial x_i$ is nondecreasing in $x_j$ for all distinct
indexes $i$ and $j$ and for any $x \in \Re^n$. Furthermore, if $f$ is twice differentiable, then
$f$ is supermodular if and only if $\partial^2 f(x) / \partial x_i \partial x_j \ge 0$ for any distinct indexes $i$ and $j$ and
for any $x \in \Re^n$.
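
The lattice inequality behind supermodularity can also be checked numerically on a finite grid. The following is a minimal sketch, not a proof: the helper names `meet`, `join`, and `is_supermodular_on_grid` and the two test functions are our own illustrative choices. It confirms that $x_1 x_2$ (cross-partial $1 \ge 0$) passes the check while $-x_1 x_2$ fails it.

```python
import itertools

def meet(x, y):
    """Componentwise minimum, i.e., x ∧ y."""
    return tuple(min(a, b) for a, b in zip(x, y))

def join(x, y):
    """Componentwise maximum, i.e., x ∨ y."""
    return tuple(max(a, b) for a, b in zip(x, y))

def is_supermodular_on_grid(f, grid_values, n):
    """Check f(x) + f(x') <= f(x ∧ x') + f(x ∨ x') for every pair of grid points."""
    points = list(itertools.product(grid_values, repeat=n))
    for x in points:
        for y in points:
            if f(x) + f(y) > f(meet(x, y)) + f(join(x, y)):
                return False
    return True

# Illustrative functions (assumptions for this sketch, not from the text).
def product(x):
    return x[0] * x[1]

def neg_product(x):
    return -x[0] * x[1]

print(is_supermodular_on_grid(product, range(-2, 3), 2))
print(is_supermodular_on_grid(neg_product, range(-2, 3), 2))
```

A grid check of this kind can only refute supermodularity or fail to refute it on the sampled points; the derivative conditions above remain the tool for a general argument.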

From the definition of supermodularity, we can easily conclude that a separable
function $f : \Re^n \to \Re$ is both supermodular and submodular. In fact, the reverse
is essentially true.

Theorem 2.2.3 A function $f : \Re^n \to \Re$ is both supermodular and submodular if
and only if $f$ is separable; that is, there exist functions $f_i : \Re \to \Re$ ($i = 1, 2, \ldots, n$)
such that $f(x) = \sum_{i=1}^{n} f_i(x_i)$ for any $x = (x_1, x_2, \ldots, x_n) \in \Re^n$.

Proof. The “if” part is obvious, since as we already pointed out, a function defined
on $\Re$ is both supermodular and submodular. Hence, we focus on the “only if” part.
Assume that $f : \Re^n \to \Re$ is both supermodular and submodular. Then

$$\begin{aligned}
f(x) &= f(0) + \sum_{i=1}^{n} \bigl( f(x_1, \ldots, x_{i-1}, x_i, 0, \ldots, 0) - f(x_1, \ldots, x_{i-1}, 0, 0, \ldots, 0) \bigr) \\
&= f(0) + \sum_{i=1}^{n} \bigl( f(0, \ldots, 0, x_i, 0, \ldots, 0) - f(0) \bigr),
\end{aligned}$$

where the second equality holds since from Theorem 2.2.2, $f$ has both increasing
differences and decreasing differences. Therefore, $f$ is separable. ∎

In the following, we present some examples of supermodular functions, whose
proof is left as an exercise.

Theorem 2.2.4

(a) The function $f(x, z) = \sum_{i=1}^{n} x_i z_i$ is supermodular on the product space $\Re^n \times \Re^n$,
where $x, z \in \Re^n$.

(b) The Cobb-Douglas function $f(x) = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$ for $\alpha_i \ge 0$ is supermodular
on the set $\{x \mid x = (x_1, x_2, \ldots, x_n) \ge 0\}$.

(c) The function $f(x, z) = -\sum_{i=1}^{n} |x_i - z_i|^p$ is supermodular on $(x, z) \in \Re^{2n}$ for
any $p \ge 1$.

(d) If $f_i(z)$ is increasing (decreasing) on $\Re$ for $i = 1, 2, \ldots, n$, then the function
$f(x) = \min_{i \in \{1, 2, \ldots, n\}} f_i(x_i)$ is supermodular on $\Re^n$.

We now list below some useful properties about supermodular functions. Some of 
these properties are similar to Proposition 2.1.3, which deals with convex functions. 

Proposition 2.2.5

(a) Any nonnegative linear combination of supermodular functions is supermodular.
That is, if $f_i : \Re^n \to \Re$ ($i = 1, 2, \ldots, m$) are supermodular, then for
any scalar $\alpha_i \ge 0$, $\sum_{i=1}^{m} \alpha_i f_i$ is also supermodular.

(b) If $f_k$ is supermodular for $k = 1, 2, \ldots$ and $\lim_{k \to \infty} f_k(x) = f(x)$ for any
$x \in \Re^n$, then $f(x)$ is supermodular.

(c) A composition of an increasing (decreasing) convex function and an increasing
supermodular (submodular) function is still supermodular. That is, if
$f : \Re \to \Re$ is convex and nondecreasing (nonincreasing) and $g : \Re^n \to \Re$ is
increasing and supermodular (submodular), then $f(g(x))$ is supermodular.

(d) Assume that a function $f(\cdot, \cdot)$ is defined in the product space $\Re^n \times \Re^m$. If
$f(\cdot, y)$ is supermodular for any given $y \in \Re^m$, then for a random vector $\zeta$ in
$\Re^m$, $E_\zeta[f(x, \zeta)]$ is supermodular, provided it is well defined.

(e) Assume that $A$ is a lattice in $\Re^n \times \Re^m$ and a function $f(\cdot, \cdot) : A \to \Re$ is
supermodular. For any $x \in \Re^n$, let $S(x) = \{y \in \Re^m \mid (x, y) \in A\}$ and
$\Pi_y(A) = \{x \in \Re^n \mid S(x) \ne \emptyset\}$. Then $\Pi_y(A)$ is a lattice. If the function

$$g(x) = \sup_{y \in S(x)} f(x, y)$$

is finite on $\Pi_y(A)$, then $g$ is supermodular over $\Pi_y(A)$.



Proof. Parts (a), (b), and (d) follow directly from the definition of supermodular
functions.

We now prove part (c). Assume that $g$ is increasing and supermodular and $f$ is
convex and nondecreasing. For any $x, x' \in \Re^n$, since $g$ is increasing, we conclude
that $g(x \wedge x') \le g(x), g(x') \le g(x \vee x')$. Therefore, there exist $\lambda, \lambda' \in [0, 1]$ such
that

$$g(x) = (1 - \lambda) g(x \wedge x') + \lambda g(x \vee x') \quad \text{and} \quad g(x') = (1 - \lambda') g(x \wedge x') + \lambda' g(x \vee x').$$

Adding the two equalities together gives us

$$g(x) + g(x') = (2 - \lambda - \lambda') g(x \wedge x') + (\lambda + \lambda') g(x \vee x').$$

Since $g$ is supermodular, we must have either $g(x \wedge x') = g(x \vee x')$ or $\lambda + \lambda' \le 1$.
In the first case, clearly

$$f(g(x)) + f(g(x')) = f(g(x \wedge x')) + f(g(x \vee x')).$$

In the second case,

$$\begin{aligned}
f(g(x)) + f(g(x')) &\le f(g(x \wedge x')) + f(g(x \vee x')) \\
&\quad + (1 - \lambda - \lambda') \bigl( f(g(x \wedge x')) - f(g(x \vee x')) \bigr) \\
&\le f(g(x \wedge x')) + f(g(x \vee x')),
\end{aligned}$$

where the first inequality follows from the convexity of function $f$ and the second
inequality holds since $\lambda + \lambda' \le 1$, $f$ is nondecreasing, and $g$ is increasing. Hence,
$f(g(x))$ is supermodular. Obviously, the above argument holds true when $f$ is
convex and nonincreasing and $g$ is increasing and submodular.

For part (e), for any $x, x' \in \Pi_y(A)$ and any $y \in S(x)$, $y' \in S(x')$, we have that

$$(x \wedge x', y \wedge y') \in A \quad \text{and} \quad (x \vee x', y \vee y') \in A.$$

Thus, $y \wedge y' \in S(x \wedge x')$ and $y \vee y' \in S(x \vee x')$, which imply that $x \wedge x', x \vee x' \in \Pi_y(A)$
and $\Pi_y(A)$ is a lattice. From the supermodularity of $f$, we have that

$$\begin{aligned}
f(x, y) + f(x', y') &\le f(x \wedge x', y \wedge y') + f(x \vee x', y \vee y') \\
&\le g(x \wedge x') + g(x \vee x').
\end{aligned}$$

Finally, taking the supremum for $y \in S(x)$ and $y' \in S(x')$ in the left-hand side of
the above inequality, we have

$$g(x) + g(x') \le g(x \wedge x') + g(x \vee x').$$

Thus, $g$ is supermodular on $\Pi_y(A)$. ∎

The following result establishes some connections between convexity and supermodularity.

Theorem 2.2.6 Let $X$ be a lattice in $\Re^n$ and $a_i \in \Re$ ($i = 1, 2, \ldots, n$). For a
function $f : \Re \to \Re$, define $g : \Re^n \to \Re$ with $g(x) := f(\sum_{i=1}^{n} a_i x_i)$ for any
$x = (x_1, x_2, \ldots, x_n) \in \Re^n$. We have the following.

(a) If $a_i \ge 0$ for $i = 1, 2, \ldots, n$, and $f$ is convex, then $g$ is supermodular on $X$.

(b) If $n = 2$, $a_1 > 0$, $a_2 < 0$, and $f$ is concave, then $g$ is supermodular on $X$.

Suppose, in addition, that for any $x, x'$ with $x \le x'$, $x \in X$ implies $x' \in X$,
$\Re = \{\sum_{i=1}^{n} a_i x_i \mid x \in X\}$, and $f$ is continuous.

(c) If $n \ge 2$, $a_1 > 0$, $a_2 > 0$, and $g$ is supermodular on $X$, then $f$ is convex.

(d) If $n \ge 2$, $a_1 > 0$, $a_2 < 0$, and $g$ is supermodular on $X$, then $f$ is concave.

(e) If $n \ge 3$, $a_1 > 0$, $a_2 > 0$, $a_3 < 0$, and $g$ is supermodular on $X$, then $f$ is a
linear function.

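Proof. First, observe that for any $x, x' \in \Re^n$,

$$\sum_{i=1}^{n} a_i x_i - \sum_{i=1}^{n} a_i (x_i \wedge x'_i) = \sum_{i=1}^{n} a_i (x_i \vee x'_i) - \sum_{i=1}^{n} a_i x'_i. \qquad (2.12)$$

We now prove part (a). Since $a_i \ge 0$, we have

$$\sum_{i=1}^{n} a_i (x_i \wedge x'_i) \le \sum_{i=1}^{n} a_i x_i,\ \sum_{i=1}^{n} a_i x'_i \le \sum_{i=1}^{n} a_i (x_i \vee x'_i).$$

Therefore, there exist $\lambda, \lambda' \in [0, 1]$ such that

$$\sum_{i=1}^{n} a_i x_i = (1 - \lambda) \sum_{i=1}^{n} a_i (x_i \wedge x'_i) + \lambda \sum_{i=1}^{n} a_i (x_i \vee x'_i)$$

and

$$\sum_{i=1}^{n} a_i x'_i = (1 - \lambda') \sum_{i=1}^{n} a_i (x_i \wedge x'_i) + \lambda' \sum_{i=1}^{n} a_i (x_i \vee x'_i).$$

Moreover, (2.12) implies $\lambda + \lambda' = 1$. Thus, from the convexity of $f$, we have that

$$\begin{aligned}
g(x) + g(x') &= f\Bigl(\sum_{i=1}^{n} a_i x_i\Bigr) + f\Bigl(\sum_{i=1}^{n} a_i x'_i\Bigr) \\
&\le f\Bigl(\sum_{i=1}^{n} a_i (x_i \wedge x'_i)\Bigr) + f\Bigl(\sum_{i=1}^{n} a_i (x_i \vee x'_i)\Bigr) \\
&= g(x \wedge x') + g(x \vee x').
\end{aligned}$$

Hence, $g$ is supermodular on $X$.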

For part (b), we argue that for any $x = (x_1, x_2)$ and $x' = (x'_1, x'_2)$,

$$\begin{aligned}
& f(a_1 x_1 + a_2 x_2) - f(a_1 (x_1 \wedge x'_1) + a_2 (x_2 \wedge x'_2)) \\
&\quad \le f(a_1 (x_1 \vee x'_1) + a_2 (x_2 \vee x'_2)) - f(a_1 x'_1 + a_2 x'_2).
\end{aligned} \qquad (2.13)$$

If $x \le x'$ or $x \ge x'$, it is obvious that the inequality (2.13) holds true. Otherwise,
assume, without loss of generality, that $x_1 = x_1 \vee x'_1$ and $x_2 = x_2 \wedge x'_2$. We have,
from (2.12), that

$$\sum_{i=1}^{2} a_i x_i - \sum_{i=1}^{2} a_i (x_i \wedge x'_i) = \sum_{i=1}^{2} a_i (x_i \vee x'_i) - \sum_{i=1}^{2} a_i x'_i \ge 0.$$

It is also easy to verify that

$$a_1 x_1 + a_2 x_2 \ge a_1 (x_1 \vee x'_1) + a_2 (x_2 \vee x'_2).$$

Hence, Proposition 2.1.6, together with the above inequality, implies the inequality
(2.13), and thus $g$ is supermodular.

To prove part (c), fix any $z, z' \in \Re$ with $z < z'$. Choose $x \in \Re^n$ such that
$\sum_{i=1}^{n} a_i x_i = z$. Let $\epsilon = (z' - z)/(2 a_1) > 0$ and $\delta = (z' - z)/(2 a_2) > 0$. Also, let
$y = x + \epsilon e_1$, $y' = x + \delta e_2$, and $x' = x + \epsilon e_1 + \delta e_2$, where $e_i$ is the unit vector with
0 at all components except 1 at its $i$th component. From the definitions of $\epsilon$ and
$\delta$, we have that $\sum_{i=1}^{n} a_i y_i = \sum_{i=1}^{n} a_i y'_i = (z + z')/2$ and $\sum_{i=1}^{n} a_i x'_i = z'$. Hence,
the supermodularity of $g$ implies that

$$f((z + z')/2) - \tfrac{1}{2}\bigl(f(z) + f(z')\bigr) = \tfrac{1}{2}\bigl(g(y) + g(y') - g(x) - g(x')\bigr) \le 0,$$

since $x = y \wedge y'$ and $x' = y \vee y'$. Thus, the convexity of the function $f$ follows from
this inequality and the continuity of $f$.

We now prove part (d). Fix any $z, z' \in \Re$ with $z < z'$. Choose $x \in \Re^n$ such that
$\sum_{i=1}^{n} a_i x_i = (z + z')/2$. Let $\epsilon = (z' - z)/(2 a_1) > 0$ and $\delta = -(z' - z)/(2 a_2) > 0$.
Also, let $y = x + \epsilon e_1$, $y' = x + \delta e_2$, and $x' = x + \epsilon e_1 + \delta e_2$. From the definitions of
$\epsilon$ and $\delta$, we have that $\sum_{i=1}^{n} a_i y_i = z'$, $\sum_{i=1}^{n} a_i y'_i = z$, and $\sum_{i=1}^{n} a_i x'_i = (z + z')/2$.
Hence, the supermodularity of $g$ implies that

$$f((z + z')/2) - \tfrac{1}{2}\bigl(f(z) + f(z')\bigr) = \tfrac{1}{2}\bigl(g(x) + g(x') - g(y) - g(y')\bigr) \ge 0,$$

since $x = y \wedge y'$ and $x' = y \vee y'$. Thus, the concavity of the function $f$ follows from
this inequality and the continuity of $f$.

Finally, if $n \ge 3$, $a_1 > 0$, $a_2 > 0$, and $a_3 < 0$, the proof for parts (c) and (d)
implies that $f$ is both convex and concave, and hence $f$ is linear. ∎

One of the most widely used properties associated with supermodularity is on the
monotonicity of the sets of optimal solutions of a class of parametric optimization
problems. To present this property, define a new concept of increasing set function.
Let $S(t)$ be a set function in $\Re^n$ parameterized by $t \in T \subseteq \Re^m$; that is, for a
parameter $t \in T$, $S(t)$ is a subset of $\Re^n$. The set function $S(t)$ is called increasing
in $t$ if for any $t, t' \in T$ with $t \le t'$, $x \in S(t)$, and $x' \in S(t')$, we have that
$x \wedge x' \in S(t)$ and $x \vee x' \in S(t')$.

The concept of increasing set functions is different from set inclusion. To see this,
notice that $S(t) = [t, +\infty)$ is an increasing set function for $t \in \Re$, but $S(t') \subseteq S(t)$
for $t \le t'$. However, for an increasing set function $S$, it is straightforward to show
that given $t, t' \in T$ with $t \le t'$, for any $x \in S(t)$, there exists a point $x' \in S(t')$
and for any $x' \in S(t')$, there exists a point $x \in S(t)$ such that $x \le x'$.
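
To make the definition concrete, the sketch below tests the increasing property for set functions with finitely many parameter values and finite sets. The function `is_increasing_set_function` and the two examples are illustrative assumptions of ours: the first is a truncated integer analogue of $S(t) = [t, +\infty)$, which shrinks as $t$ grows yet is increasing; the second shrinks from above and is not.

```python
def meet(x, y):
    """Componentwise minimum, i.e., x ∧ y."""
    return tuple(min(a, b) for a, b in zip(x, y))

def join(x, y):
    """Componentwise maximum, i.e., x ∨ y."""
    return tuple(max(a, b) for a, b in zip(x, y))

def is_increasing_set_function(S, params):
    """S maps each scalar parameter t to a finite set of tuples.

    Checks: for t <= t', x in S(t), x' in S(t'), we need
    x ∧ x' in S(t) and x ∨ x' in S(t').
    """
    for t in params:
        for t2 in params:
            if t > t2:
                continue
            for x in S[t]:
                for y in S[t2]:
                    if meet(x, y) not in S[t] or join(x, y) not in S[t2]:
                        return False
    return True

params = [0, 1, 2, 3]
# Truncated integer analogue of [t, +infinity): S(t) = {t, t+1, ..., 5}.
lower_truncated = {t: {(k,) for k in range(t, 6)} for t in params}
# Sets that shrink from above: S(t) = {0, 1, ..., 5 - t}.
upper_truncated = {t: {(k,) for k in range(0, 6 - t)} for t in params}

print(is_increasing_set_function(lower_truncated, params))
print(is_increasing_set_function(upper_truncated, params))
```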

Proposition 2.2.7 Let $S(t)$ be an increasing set function in $\Re^n$ parameterized by
$t \in T \subseteq \Re^m$. We have that $S(t)$ is a lattice of $\Re^n$ for any $t \in T$. If, in addition,
$S(t)$ is nonempty and compact for any $t \in T$, then $S(t)$ has a largest element and
a smallest element, which are increasing in $t$, respectively.

Proof. It is straightforward to see from the definition of increasing set function
that $S(t)$ is a lattice of $\Re^n$ for any $t \in T$. We now show that $S(t)$ has a largest
and a smallest element if $S(t)$ is compact. Let $x^* = (x_1^*, x_2^*, \ldots, x_n^*) \in \Re^n$ with
$x_i^* = \sup_{x \in S(t)} x_i$, $i = 1, 2, \ldots, n$. Since $S(t)$ is compact, there exists $x^i \in S(t)$
such that $x_i^i = x_i^*$, $i = 1, 2, \ldots, n$. Hence, $x^* = x^1 \vee x^2 \vee \cdots \vee x^n \in S(t)$ since $S(t)$
is a lattice. Obviously, $x^*$ is the largest element of $S(t)$.

To show that the largest element of $S(t)$, denoted as $\bar{x}(t)$, is increasing, note
that for any $t, t' \in T$ with $t \le t'$, $\bar{x}(t') \le \bar{x}(t) \vee \bar{x}(t') \in S(t')$. Since $\bar{x}(t')$ is the
largest element of $S(t')$, we must have $\bar{x}(t') = \bar{x}(t) \vee \bar{x}(t') \ge \bar{x}(t)$.

The properties regarding the smallest element of $S(t)$ can be established
similarly. ∎

Under some supermodularity assumptions, the sets of optimal solutions for a 
collection of optimization problems parameterized by a parameter are increasing 
in the parameter. In addition, for a given parameter, there exist a largest and a 
smallest optimal solution, which are increasing in the parameter as well. Consider 
the parametric optimization problem 

$$\max_{x \in S(t)} g(x, t).$$

Let $A := \{(x, t) \mid t \in T,\ x \in S(t)\} \subseteq \Re^n \times \Re^m$, and $S^*(t) := \arg\max_{x \in S(t)} g(x, t)$.

Theorem 2.2.8 Assume that $S(t)$ is increasing in $t \in T$, and $g(x, t) : A \to \Re$ is
supermodular in $x$ for any fixed $t \in T$ and has increasing differences in $(x, t)$.

(a) $S^*(t)$ is increasing in $t$ on $\{t \in T \mid S^*(t) \ne \emptyset\}$.

(b) Assume, in addition, that $S(t)$ is a nonempty and compact set of $\Re^n$ for any
$t \in T$, and $g(x, t)$ is continuous in $x$ on $S(t)$ for any $t \in T$. Then $S^*(t)$ is
a nonempty and compact lattice and thus there exist $\underline{x}(t), \bar{x}(t) \in S^*(t)$ such
that for any $x \in S^*(t)$, $\underline{x}(t) \le x \le \bar{x}(t)$. Furthermore, $\underline{x}(t)$ and $\bar{x}(t)$ are
increasing.



Proof. To prove part (a), pick any $t, t' \in T$ with $t \le t'$ such that $S^*(t)$ and $S^*(t')$
are nonempty. For any $x \in S^*(t)$ and $x' \in S^*(t')$, $x \wedge x' \in S(t)$ and $x \vee x' \in S(t')$,
since $S(t)$ is increasing in $t$. We have that

$$\begin{aligned}
0 &\ge g(x \vee x', t') - g(x', t') \\
&\ge g(x, t') - g(x \wedge x', t') \\
&\ge g(x, t) - g(x \wedge x', t) \\
&\ge 0,
\end{aligned}$$

where the first and last inequalities follow from the optimality of $x'$ for $t'$ and of $x$
for $t$, respectively, the second inequality from the supermodularity of $g$ in $x$ for
any fixed $t$, and the third inequality from the increasing differences property of
$g$ in $(x, t)$. Thus, $x \wedge x' \in S^*(t)$ and $x \vee x' \in S^*(t')$, which implies that $S^*(t)$ is
increasing in $t$ on $\{t \in T \mid S^*(t) \ne \emptyset\}$.

If $g(x, t)$ is continuous in $x$ and $S(t)$ is nonempty and compact, the set of optimal
solutions $S^*(t)$ is also nonempty and compact for any $t \in T$. Part (b) follows
from Proposition 2.2.7 directly. ∎
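
As a small numerical illustration of Theorem 2.2.8, consider the objective $g(x, t) = tx - x^2$ on the fixed feasible set $\{0, 1, \ldots, 10\}$; both the objective and the feasible set are our own illustrative assumptions. The objective has increasing differences in $(x, t)$, and a constant feasible set that is a lattice is trivially increasing in $t$, so the smallest and largest maximizers should be nondecreasing in $t$. The sketch below computes them by enumeration.

```python
def argmax_set(g, feasible, t):
    """Return all maximizers of g(., t) over a finite feasible set, sorted."""
    best = max(g(x, t) for x in feasible)
    return sorted(x for x in feasible if g(x, t) == best)

def g(x, t):
    # Illustrative objective (an assumption): g(x', t) - g(x, t) grows with t when x' > x.
    return t * x - x * x

feasible = range(0, 11)    # S(t) = {0, ..., 10} for every t
parameters = range(0, 13)

smallest = [argmax_set(g, feasible, t)[0] for t in parameters]
largest = [argmax_set(g, feasible, t)[-1] for t in parameters]
print(smallest)
print(largest)
```

For odd $t$ the problem has two maximizers, which is why the theorem is stated in terms of the whole solution set and its extreme elements rather than a single optimal solution.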

To conclude this subsection, we present an extension of a fixed-point theorem due 
to Tarski (1955), which will be useful to prove the existence of a Nash equilibrium 
of supermodular games in the next section. 

Theorem 2.2.9 Let $C$ be a nonempty and compact lattice of $\Re^n$. If $S$ is an increasing
set function mapping any point $x \in C$ to a nonempty and compact lattice
that itself is a subset of $C$, then the set of fixed points $E = \{x \in C : x \in S(x)\}$ is
nonempty and has a largest and a smallest fixed point. If in addition, $S$ depends
on a parameter $t \in T \subseteq \Re^m$ and for fixed $x \in C$, $S(x, t)$ is increasing in $t$, then
the largest and the smallest fixed points increase in $t$.

Proof. Define a set

$$\bar{C} = \{x \in C : \exists\, y \in S(x) \text{ such that } y \ge x\}.$$

The set $\bar{C}$ is nonempty since it includes the smallest element of $C$, which is guaranteed
to exist by Proposition 2.2.7.

Let $\bar{x} = \sup\{x : x \in \bar{C}\}$. Since $C$ is a compact set, $\bar{x}$ is well defined. We show
that $\bar{x}$ is a fixed point. Note that for $x \in \bar{C}$, there exists a $y_x \in S(x)$ such that
$y_x \ge x$. Since $S$ is increasing and $x \le \bar{x}$ for any $x \in \bar{C}$, there exists a point
$\bar{y}_x \in S(\bar{x})$ such that $\bar{y}_x \ge y_x$. Therefore, $\bar{y} = \sup\{\bar{y}_x : x \in \bar{C}\} \in S(\bar{x})$ since
$S(\bar{x})$ is a compact lattice. Because $\bar{y}_x \ge x$ for any $x \in \bar{C}$, we have that $\bar{y} \ge \bar{x}$.
Thus, there exists a point $z \in S(\bar{y})$ such that $z \ge \bar{y}$ and $\bar{y} \in \bar{C}$. However, $\bar{x}$
is the least upper bound of $\bar{C}$ and $\bar{y} \ge \bar{x}$. We must have that $\bar{x} = \bar{y} \in S(\bar{x})$.
Thus, $E$ is nonempty and $\bar{x}$ is its largest element. Similarly, we can show that
$\underline{x} = \inf\{x \in C : \exists\, y \in S(x) \text{ such that } y \le x\}$ is the smallest fixed point in $E$.

If $S$ also depends on a parameter $t \in T$, define for a given $t \in T$

$$\bar{C}(t) = \{x \in C : \exists\, y \in S(x, t) \text{ such that } y \ge x\}.$$

From the argument in the previous paragraph, it suffices to show that $\bar{C}(t)$ is
increasing in $t$ if $S(x, t)$ is increasing in $t$ for any fixed $x \in C$. To see this, pick
two points $t, t' \in T$ with $t \le t'$ and $x \in \bar{C}(t)$ and $x' \in \bar{C}(t')$. By definition, there
exist $y \in S(x, t)$ and $y' \in S(x', t')$ such that $y \ge x$ and $y' \ge x'$. Since $S(x, t)$ is
increasing in $t$ for any fixed $x \in C$, there exists a point $z \in S(x, t')$ such that
$z \ge y \ge x$. Therefore, $x \vee x' \le z \vee y' \in S(x \vee x', t')$ and $x \vee x' \in \bar{C}(t')$. Similarly,
we can show that $x \wedge x' \in \bar{C}(t)$. Thus, $\bar{C}(t)$ is increasing in $t$. ∎
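
The sketch below illustrates the conclusion of Theorem 2.2.9 in a much simpler setting than the theorem covers: a single-valued increasing map on the finite lattice $\{0, \ldots, 5\}^2$. In that special case, iterating the map from the bottom element reaches the smallest fixed point and iterating from the top element reaches the largest. The map `make_map(t)` and the helper `extreme_fixed_points` are hypothetical examples of ours, not constructions from the text.

```python
def extreme_fixed_points(F, bottom, top):
    """Iterate an increasing self-map F of a finite lattice from both ends.

    Starting at the bottom the iterates increase; starting at the top
    they decrease; each sequence stops at a fixed point.
    """
    x = bottom
    while F(x) != x:
        x = F(x)
    y = top
    while F(y) != y:
        y = F(y)
    return x, y

def make_map(t):
    """Illustrative map on {0,...,5}^2, increasing in x and in the parameter t (t in {0, 1})."""
    def F(x):
        x1, x2 = x
        return (max(1 + t, min(x2, 4 + t)), max(1 + t, min(x1, 4 + t)))
    return F

for t in (0, 1):
    print(t, extreme_fixed_points(make_map(t), (0, 0), (5, 5)))
```

Both extreme fixed points move up when $t$ increases from 0 to 1, in line with the last statement of the theorem.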



2.3 Discrete Convex Analysis 

In this section, we give a brief introduction to discrete convex analysis. Some of
the concepts and results there have proven useful for inventory models,
and we expect they can find more applications in supply chain management.

Discrete convex analysis can be treated as an extension of convex analysis to
combinatorial structures. It aims at building a unified theoretical framework for
tractable discrete optimization problems. Two key concepts, $L^\natural$-convexity and
$M^\natural$-convexity defined on either real variables or integer variables, play prominent roles
(here L and M stand for “lattice” and “matroid,” respectively). In the following,
we use the notation $\mathcal{F}$ to denote either the real space $\Re$ or the set of all integers
$Z$ and $\mathcal{F}_+$ to denote the set of nonnegative elements in $\mathcal{F}$. Recall $\bar{\Re} = \Re \cup \{+\infty\}$.

2.3.1 $L^\natural$-Convexity

Definition 2.3.1 ($L^\natural$-Convexity) A function $f : \mathcal{F}^n \to \bar{\Re}$ is $L^\natural$-convex if for
any $x, x' \in \mathcal{F}^n$, $\alpha \in \mathcal{F}_+$,

$$f(x) + f(x') \ge f((x + \alpha e) \wedge x') + f(x \vee (x' - \alpha e)), \qquad (2.14)$$

where $e$ is the $n$-dimensional all-ones vector. A function $f$ is $L^\natural$-concave if $-f$ is
$L^\natural$-convex.

Inequality (2.14) is called translation submodular. Notice that in the above
definition, if $f(x) = +\infty$ or $f(x') = +\infty$, inequality (2.14) is assumed to hold
automatically. Thus, for an $L^\natural$-convex function $f$, its effective domain $V = \mathrm{dom}(f) =
\{x \in \mathcal{F}^n \mid f(x) < +\infty\}$ is an $L^\natural$-convex set; that is, it satisfies the following
condition:

$$\forall\, x, x' \in V \text{ and } \alpha \in \mathcal{F}_+, \quad (x + \alpha e) \wedge x' \in V \text{ and } x \vee (x' - \alpha e) \in V.$$

Let $\delta_V : \mathcal{F}^n \to \bar{\Re}$ be the indicator function of a given set $V \subseteq \mathcal{F}^n$; that is, $\delta_V(x) = 0$
if $x \in V$ and $+\infty$ otherwise. It is easy to verify that the set $V$ is $L^\natural$-convex if and
only if its indicator function $\delta_V$ is $L^\natural$-convex.

We sometimes say a function $f$ is $L^\natural$-convex on a set $V \subseteq \mathcal{F}^n$ with the
understanding that $V$ is an $L^\natural$-convex set and the extension of $f$ to the whole space
$\mathcal{F}^n$ by defining $f(x) = +\infty$ for $x \notin V$ is $L^\natural$-convex. It is also straightforward to
show that an $L^\natural$-convex function $f$ restricted to an $L^\natural$-convex set $V$ by defining
$f(x) = +\infty$ for $x \notin V$ is also $L^\natural$-convex. From the definition of $L^\natural$-convexity, it is
clear that if $f : \Re^n \to \bar{\Re}$ is $L^\natural$-convex, then $f_Z : Z^n \to \bar{\Re}$ is also $L^\natural$-convex, where
$f_Z(x) = f(x)$ for any $x \in Z^n$.

We have the following equivalent definition of $L^\natural$-convexity.

Proposition 2.3.2 A function $f : \mathcal{F}^n \to \bar{\Re}$ is $L^\natural$-convex if and only if $g(x, \xi) :=
f(x - \xi e)$ is submodular on $(x, \xi) \in \mathcal{F}^n \times S$, where $S$ is the intersection of $\mathcal{F}$ and
any unbounded interval in $\Re$.

Proof. Assume that $f$ is $L^\natural$-convex. Consider any two vectors $(x, \xi) \in \mathcal{F}^n \times S$ and
$(x', \xi') \in \mathcal{F}^n \times S$. Without loss of generality, assume that $\xi \ge \xi'$. Let $\alpha = \xi - \xi'$.
We have that $\alpha \in \mathcal{F}_+$ and

$$\begin{aligned}
g(x, \xi) + g(x', \xi') &= f(x - \xi e) + f(x' - \xi' e) \\
&\ge f((x - \xi e + \alpha e) \wedge (x' - \xi' e)) + f((x - \xi e) \vee (x' - \xi' e - \alpha e)) \\
&= f(x \wedge x' - \xi' e) + f(x \vee x' - \xi e) \\
&= g(x \wedge x', \xi \wedge \xi') + g(x \vee x', \xi \vee \xi'),
\end{aligned}$$

where the inequality follows from the $L^\natural$-convexity of $f$ and the second equality
from the definition of $\alpha$. Thus, $g$ is submodular.

On the other hand, assume that $g$ is submodular. For any $x, x' \in \mathcal{F}^n$ and $\alpha \in \mathcal{F}_+$,
since $S$ is the intersection of $\mathcal{F}$ and an unbounded interval in $\Re$, it is clear that we
can find a pair $\xi \in S$ and $\xi' \in S$ such that $\xi = \alpha + \xi'$. We have that

$$\begin{aligned}
f(x) + f(x') &= g(x + \xi e, \xi) + g(x' + \xi' e, \xi') \\
&\ge g((x + \xi e) \wedge (x' + \xi' e), \xi \wedge \xi') + g((x + \xi e) \vee (x' + \xi' e), \xi \vee \xi') \\
&= g((x + \xi' e + \alpha e) \wedge (x' + \xi' e), \xi') + g((x + \xi e) \vee (x' + \xi e - \alpha e), \xi) \\
&= f((x + \alpha e) \wedge x') + f(x \vee (x' - \alpha e)),
\end{aligned}$$

where the inequality follows from the submodularity of $g$. Hence, $f$ is $L^\natural$-convex. ∎
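
Translation submodularity (2.14) can be tested by brute force on a small integer box. The sketch below is only a finite check under our own assumptions: the function name `is_l_natural_convex_on_box`, the box, and the two test functions are illustrative. The first function has the form $g_1(x_1) + h_{12}(x_1 - x_2)$ with convex univariate pieces; the second, $(x_1 + x_2)^2$, already violates the inequality at $\alpha = 0$.

```python
import itertools

def is_l_natural_convex_on_box(f, lo, hi, n, max_alpha):
    """Check f(x) + f(x') >= f((x + a*e) ∧ x') + f(x ∨ (x' - a*e))
    for all integer x, x' in [lo, hi]^n and integer a in {0, ..., max_alpha}."""
    points = list(itertools.product(range(lo, hi + 1), repeat=n))
    for x in points:
        for y in points:
            for a in range(max_alpha + 1):
                p = tuple(min(xi + a, yi) for xi, yi in zip(x, y))  # (x + a e) ∧ x'
                q = tuple(max(xi, yi - a) for xi, yi in zip(x, y))  # x ∨ (x' - a e)
                if f(x) + f(y) < f(p) + f(q):
                    return False
    return True

# Illustrative functions (assumptions for this sketch).
def f_good(x):
    return x[0] ** 2 + (x[0] - x[1]) ** 2

def f_bad(x):
    return (x[0] + x[1]) ** 2

print(is_l_natural_convex_on_box(f_good, -2, 2, 2, 4))
print(is_l_natural_convex_on_box(f_bad, -2, 2, 2, 4))
```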

In the following we present some examples of $L^\natural$-convex functions and $L^\natural$-convex
sets, whose proof is left as an exercise.

Proposition 2.3.3

(a) Given any univariate convex functions $g_i : \Re \to \Re$ ($i = 1, \ldots, n$) and $h_{ij} :
\Re \to \Re$ ($i \ne j$), the function $f : \Re^n \to \Re$ defined by

$$f(x) := \sum_{i=1}^{n} g_i(x_i) + \sum_{i \ne j} h_{ij}(x_i - x_j)$$

is $L^\natural$-convex. As a special case, any linear function is $L^\natural$-convex.

(b) A quadratic function $f : \Re^n \to \Re$ defined by $f(x) = \sum_{i,j=1}^{n} a_{ij} x_i x_j$ with
$a_{ij} = a_{ji} \in \Re$ is $L^\natural$-convex if and only if the matrix $A$ with its $ij$th component
being $a_{ij}$ is a diagonally dominated M-matrix; that is,

$$a_{ij} \le 0,\ \forall\, i \ne j, \quad a_{ii} \ge 0, \quad \text{and} \quad \sum_{j=1}^{n} a_{ij} \ge 0,\ \forall\, i.$$

(c) A twice continuously differentiable function $f : \Re^n \to \Re$ is $L^\natural$-convex if and
only if its Hessian is a diagonally dominated M-matrix.

(d) For a given vector $a \in \Re^n$ and a nondecreasing univariate function $f : \Re \to
\Re$, the function $g : \Re^n \to \Re$ defined by $g(x) = f(\max_{i=1:n}\{a_i + x_i\})$ is
$L^\natural$-convex.

(e) A set with a representation $\{x \in \mathcal{F}^n : l \le x \le u,\ x_i - x_j \le v_{ij}\ \forall\, i \ne j\}$ is
$L^\natural$-convex in the space $\mathcal{F}^n$, where $l, u \in \mathcal{F}^n$ and $v_{ij} \in \mathcal{F}$ ($i \ne j$). In fact, any
closed $L^\natural$-convex set in the space $\mathcal{F}^n$ can have such a representation.

We now list below some useful preservation properties about $L^\natural$-convex functions.
Some of these properties are similar to but more restrictive compared with
those in Proposition 2.2.5, which deals with supermodular functions.

Proposition 2.3.4

(a) Any nonnegative linear combination of $L^\natural$-convex functions is $L^\natural$-convex.
That is, if $f_i : \mathcal{F}^n \to \Re$ ($i = 1, 2, \ldots, m$) are $L^\natural$-convex, then for any scalar
$\alpha_i \ge 0$, $\sum_{i=1}^{m} \alpha_i f_i$ is also $L^\natural$-convex.

(b) If $f_k$ is $L^\natural$-convex for $k = 1, 2, \ldots$ and $\lim_{k \to \infty} f_k(x) = f(x)$ for any $x \in \mathcal{F}^n$,
then $f(x)$ is $L^\natural$-convex.

(c) Assume that a function $f(\cdot, \cdot)$ is defined on the product space $\mathcal{F}^n \times \mathcal{F}^m$. If
$f(\cdot, y)$ is $L^\natural$-convex for any given $y \in \mathcal{F}^m$, then for a random vector $\zeta$ in
$\mathcal{F}^m$, $E_\zeta[f(x, \zeta)]$ is $L^\natural$-convex, provided it is well defined.

(d) If $f : \mathcal{F}^n \to \Re$ is an $L^\natural$-convex function, then $g : \mathcal{F}^n \times \mathcal{F} \to \Re$ defined by
$g(x, \xi) = f(x - \xi e)$ is also $L^\natural$-convex.

(e) Assume that $A$ is an $L^\natural$-convex set of $\mathcal{F}^n \times \mathcal{F}^m$ and $f(\cdot, \cdot) : \mathcal{F}^n \times \mathcal{F}^m \to \Re$
is an $L^\natural$-convex function. Then the function

$$g(x) = \inf_{y : (x, y) \in A} f(x, y)$$

is $L^\natural$-convex over $\mathcal{F}^n$ if $g(x) \ne -\infty$ for any $x \in \mathcal{F}^n$.

Proof. Parts (a), (b), and (c) are straightforward; thus, we only prove parts (d)
and (e). We first prove part (d). To show that $g(x, \xi)$ is $L^\natural$-convex on $\mathcal{F}^n \times \mathcal{F}$,
notice that for any $(x, \xi, \zeta) \in \mathcal{F}^n \times \mathcal{F} \times \mathcal{F}$, $g(x - \zeta e, \xi - \zeta) = g(x, \xi)$, which
is independent of $\zeta$ and submodular in $(x, \xi)$. Therefore, from Proposition 2.3.2,
$g(x, \xi)$ is $L^\natural$-convex.

To prove part (e), we assume without loss of generality that $A = \mathcal{F}^n \times \mathcal{F}^m$;
otherwise, we can focus on the restriction of $f$ on $A$. From the definition of $g$, we
have

$$g(x - \xi e) = \inf_{y \in \mathcal{F}^m} f(x - \xi e, y) = \inf_{y \in \mathcal{F}^m} f(x - \xi e, y - \xi e).$$

Since $f$ is $L^\natural$-convex, from Proposition 2.3.2, the function $f(x - \xi e, y - \xi e)$ is
submodular in $(x, y, \xi) \in \mathcal{F}^n \times \mathcal{F}^m \times \mathcal{F}$. The preservation of the supermodularity
property in Proposition 2.2.5 part (e) then implies that $g(x - \xi e)$ is submodular
for $(x, \xi) \in \mathcal{F}^n \times \mathcal{F}$, and thus from Proposition 2.3.2, $g$ is $L^\natural$-convex. ∎

Since an $L^\natural$-convex function is submodular, we can prove monotonicity properties
of optimal solutions similar to the ones in Theorem 2.2.8. In a simple setting
presented below, one can also show that the optimal solutions have bounded
sensitivities. The result was first established by Zipkin (2008) to analyze the structural
properties of stochastic inventory models with lost sales.

Lemma 2.3.5 Let $g(x, \xi) : \mathcal{F}^n \times \mathcal{F} \to \bar{\Re}$ be $L^\natural$-convex, and let $\xi(x)$ be the
largest optimal solution (assuming existence) of the optimization problem $f(x) =
\min_{\xi \in \mathcal{F}} g(x, \xi)$ for any $x \in \mathrm{dom}(f)$. Then $\xi(x)$ is nondecreasing in $x \in \mathrm{dom}(f)$,
but $\xi(x + \omega e) \le \xi(x) + \omega$ for any $\omega > 0$ with $\omega \in \mathcal{F}$ and $x + \omega e \in \mathrm{dom}(f)$.

Proof. The statement that $\xi(x)$ is nondecreasing in $x \in \mathrm{dom}(f)$ follows directly
from Theorem 2.2.8. We now prove that $\xi(x + \omega e) \le \xi(x) + \omega$. Define

$$g_e(x, \xi, \omega) = g(x - \omega e, \xi - \omega).$$

The $L^\natural$-convexity of function $g(x, \xi)$, together with Proposition 2.3.2, implies the
submodularity of function $g_e(x, \xi, \omega)$. For any given $\xi' \in \Re$ with $\xi' > \xi(x) + \omega$,
consider two vectors $(x, \xi' - \omega, -\omega)$ and $(x, \xi(x), 0)$. Since $\omega > 0$ and $\xi' - \omega > \xi(x)$,

$$(x, \xi' - \omega, -\omega) \wedge (x, \xi(x), 0) = (x, \xi(x), -\omega), \quad (x, \xi' - \omega, -\omega) \vee (x, \xi(x), 0) = (x, \xi' - \omega, 0),$$

and we have that for any $\xi' > \xi(x) + \omega$,

$$\begin{aligned}
g(x + \omega e, \xi') - g(x + \omega e, \xi(x) + \omega) &= g_e(x, \xi' - \omega, -\omega) - g_e(x, \xi(x), -\omega) \\
&\ge g_e(x, \xi' - \omega, 0) - g_e(x, \xi(x), 0) \\
&= g(x, \xi' - \omega) - g(x, \xi(x)) \\
&> 0,
\end{aligned}$$

where the first inequality follows from the submodularity of $g_e(x, \xi, \omega)$ and the
last inequality from the assumption of $\xi(x)$ and $\xi'$. The above inequalities imply
that $\xi'$ cannot be optimal for the optimization problem $\min_{\xi \in \mathcal{F}} g(x + \omega e, \xi)$ for
any $\xi' > \xi(x) + \omega$. ∎
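
A one-dimensional integer example may help to see both conclusions of Lemma 2.3.5 at once. The sketch below assumes the illustrative function $g(x, \xi) = (x - \xi)^2 + \xi^2$ with $x, \xi \in Z$, which is $L^\natural$-convex by Proposition 2.3.3 part (a); the helper `largest_minimizer` and the search ranges are our own choices. Here the largest minimizer is $\lceil x/2 \rceil$, so it moves in the same direction as $x$ but by no more than the shift in $x$.

```python
def largest_minimizer(g, x, candidates):
    """Largest xi among the minimizers of g(x, .) over a finite candidate range."""
    best = min(g(x, xi) for xi in candidates)
    return max(xi for xi in candidates if g(x, xi) == best)

def g(x, xi):
    # Illustrative L-natural-convex function of (x, xi) on Z^2 (an assumption).
    return (x - xi) ** 2 + xi ** 2

candidates = range(-20, 21)   # wide enough to contain every minimizer below
xs = range(-10, 11)
xi_star = {x: largest_minimizer(g, x, candidates) for x in xs}

monotone = all(xi_star[x] <= xi_star[x + 1] for x in range(-10, 10))
bounded = all(xi_star[x + w] <= xi_star[x] + w
              for x in xs for w in (1, 2, 3) if x + w in xi_star)
print(monotone, bounded)
```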

2.3.2 $M^\natural$-Convexity

We now introduce the $M^\natural$-convexity. For a given $x \in \mathcal{F}^n$, define its positive support
set

$$\mathrm{supp}^+(x) = \{i \mid x_i > 0\}.$$

Let $e_i \in \mathcal{F}^n$ be the unit vector with 1 at its $i$th component for $i = 1, 2, \ldots, n$ and
$e_0$ be the zero vector.

Definition 2.3.6 ($M^\natural$-Convexity) A function $f : \mathcal{F}^n \to \bar{\Re}$ is $M^\natural$-convex if the
exchange property holds; that is, for any $x, x' \in \mathrm{dom}(f)$, $i \in \mathrm{supp}^+(x - x')$, there
exist an index $j \in \mathrm{supp}^+(x' - x) \cup \{0\}$ and a positive number $\alpha_0 \in \mathcal{F}$ such that

$$f(x) + f(x') \ge f(x - \alpha(e_i - e_j)) + f(x' + \alpha(e_i - e_j)) \qquad (2.15)$$

for any $\alpha \in \mathcal{F}$ with $0 \le \alpha \le \alpha_0$. A function $f$ is $M^\natural$-concave if $-f$ is $M^\natural$-convex.

Similar to $L^\natural$-convexity, a set $V \subseteq \mathcal{F}^n$ is called an $M^\natural$-convex set if its indicator
function $\delta_V$ is $M^\natural$-convex. It is clear that the effective domain of an $M^\natural$-convex
function is $M^\natural$-convex. We can also prove that $M^\natural$-convexity is preserved when
taking the limit of a sequence of $M^\natural$-convex functions.

Unfortunately, unlike $L^\natural$-convexity, $M^\natural$-convexity is less amenable to analysis.
In fact, an $M^\natural$-convex function restricted to an $M^\natural$-convex set is not necessarily
$M^\natural$-convex, and neither is the sum of two $M^\natural$-convex functions.

A subclass of $M^\natural$-convex functions, relatively easier to deal with, is the so-called
laminar convex functions, defined as the sum of univariate convex functions
indexed by a laminar family.

Definition 2.3.7 A nonempty set $\mathcal{C}$ consisting of subsets of $\{1, 2, \ldots, n\}$ is called
a laminar family if for any $X, Y \in \mathcal{C}$, $X \cap Y = \emptyset$ or $X \subseteq Y$ or $X \supseteq Y$. A function
$f : \mathcal{F}^n \to \Re$ is a laminar convex function if it can be represented as

$$f(x) = \sum_{S \in \mathcal{C}} f_S\Bigl(\sum_{i \in S} x_i\Bigr),$$

where $f_S$ are univariate convex functions and $\mathcal{C}$ is a laminar family.
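
The laminar condition is easy to test mechanically, and a laminar convex function is easy to assemble from its pieces. The following sketch uses hypothetical helper names (`is_laminar`, `laminar_convex`) and, unlike the text, indexes coordinates from 0 when evaluating the function.

```python
def is_laminar(family):
    """True if any two sets in the family are disjoint or nested."""
    sets = [frozenset(s) for s in family]
    for a in sets:
        for b in sets:
            if a & b and not (a <= b or b <= a):
                return False
    return True

def laminar_convex(pieces):
    """Build f(x) = sum over (S, f_S) of f_S(sum of x_i for i in S).

    `pieces` is a list of (S, f_S) pairs; S holds 0-based coordinate indexes.
    The caller is responsible for S ranging over a laminar family and for
    each f_S being convex.
    """
    def f(x):
        return sum(f_S(sum(x[i] for i in S)) for S, f_S in pieces)
    return f

print(is_laminar([{1}, {2}, {1, 2}, {3}, {1, 2, 3}]))   # nested or disjoint
print(is_laminar([{1, 2}, {2, 3}]))                     # overlap without nesting

square = lambda z: z * z
f = laminar_convex([((0,), square), ((1,), abs), ((0, 1), square)])
print(f((1, -2)))
```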

We now present a few facts about $M^\natural$-convex functions and laminar convex
functions, whose proof is left as an exercise.

Theorem 2.3.8

(a) A laminar convex function $f : \mathcal{F}^n \to \Re$ is $M^\natural$-convex.

(b) Any separable convex function, defined as the sum of univariate convex functions,
is laminar convex and thus $M^\natural$-convex. As a special case, any linear
function is laminar convex and $M^\natural$-convex.

(c) The intersection of the sets of $L^\natural$-convex and $M^\natural$-convex functions is exactly
the set of separable convex functions.

(d) A quadratic function $f : \Re^n \to \Re$ defined by $f(x) = \sum_{i,j=1}^{n} a_{ij} x_i x_j$ with
$a_{ij} = a_{ji} \in \Re$ is $M^\natural$-convex in $Z^n$ if and only if it is laminar convex, which
is also equivalent to the following condition:

$$a_{ij} \ge 0,\ \forall\, i, j = 1, \ldots, n, \quad \text{and} \quad a_{ij} \ge \min\{a_{ik}, a_{jk}\} \text{ for } k \notin \{i, j\}. \qquad (2.16)$$

(e) A quadratic function $f : \Re^n \to \Re$ defined by $f(x) = \sum_{i,j=1}^{n} a_{ij} x_i x_j$ with
$a_{ij} = a_{ji} \in \Re$ is $M^\natural$-convex in $\Re^n$ if and only if for any $\gamma > 0$, $A + \gamma I$ is
nonsingular and $(A + \gamma I)^{-1}$ is a diagonally dominated M-matrix, where $A$
is a matrix with $a_{ij}$ at its $ij$th component.

(f) The infimal convolution of two $M^\natural$-convex functions $f_1, f_2 : \mathcal{F}^n \to \Re$, defined
as

$$g(x) = \min_{u, v \in \mathcal{F}^n,\ u + v = x} \bigl(f_1(u) + f_2(v)\bigr),$$

remains $M^\natural$-convex in $\mathcal{F}^n$ if $g(x) > -\infty$ for any $x \in \mathcal{F}^n$.

$L^\natural$-convex functions and $M^\natural$-convex functions enjoy nice properties for optimization.
For example, as we present below, these functions defined on $Z^n$ are convex
extendable (its proof is not trivial; readers are referred to Murota 2003), and similar
to Theorem 2.1.12, a local minimizer of an $L^\natural$-convex function or an $M^\natural$-convex
function is guaranteed to be a global minimizer. We refer to Murota (2003) for
efficient algorithms for the minimization of $L^\natural$-convex functions and $M^\natural$-convex
functions and elegant duality properties between the two classes of functions.

Theorem 2.3.9 $L^\natural$-convex functions and $M^\natural$-convex functions defined on $Z^n$ are
convex extendable. That is, for any $L^\natural$-convex or $M^\natural$-convex function $f : Z^n \to \Re$,
there exists a convex function $f^e : \Re^n \to \Re$ such that $f^e(x) = f(x)$ for any $x \in Z^n$.

Theorem 2.3.10

(a) Assume that $f : Z^n \to \Re$ is $L^\natural$-convex. A vector $x^* \in \mathrm{dom}(f)$ is a global
minimizer of $f$ if and only if it is a local minimizer in the sense that

$$f(x^*) \le \min\{f(x^* + v), f(x^* - v)\}, \quad \text{for any } v \in \{0, 1\}^n.$$

(b) Assume that $f : Z^n \to \Re$ is $M^\natural$-convex. A vector $x^* \in \mathrm{dom}(f)$ is a global
minimizer of $f$ if and only if it is a local minimizer in the sense that

$$f(x^*) \le f(x^* + e_i - e_j), \quad \text{for any } i, j \in \{0, 1, \ldots, n\}.$$
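
Part (a) of Theorem 2.3.10 suggests a simple descent scheme: from the current point, inspect the $2(2^n - 1)$ neighbors $x \pm v$ with $v \in \{0, 1\}^n$, $v \ne 0$, move to the best one while it strictly improves, and stop otherwise. The sketch below is a naive illustration under our own assumptions (finite-valued $f$ on $Z^n$ with a minimizer, small $n$); it is not one of the efficient algorithms of Murota (2003), and the function names and the example objective are hypothetical.

```python
import itertools

def l_natural_descent(f, start):
    """Steepest descent over the neighborhood {x + v, x - v : v in {0,1}^n, v != 0}."""
    n = len(start)
    directions = [v for v in itertools.product((0, 1), repeat=n) if any(v)]
    x = tuple(start)
    while True:
        neighbors = [tuple(xi + s * vi for xi, vi in zip(x, v))
                     for v in directions for s in (1, -1)]
        best = min(neighbors, key=f)
        if f(best) >= f(x):
            return x          # x is a local minimizer in the sense of part (a)
        x = best

def objective(x):
    # Illustrative L-natural-convex function: g_1(x_1) + g_2(x_2) + h_12(x_1 - x_2).
    return (x[0] - 3) ** 2 + (x[1] + 1) ** 2 + (x[0] - x[1]) ** 2

x_local = l_natural_descent(objective, (0, 0))
brute_force_min = min(objective(p)
                      for p in itertools.product(range(-6, 7), repeat=2))
print(x_local, objective(x_local), brute_force_min)
```

The returned point need not be the unique minimizer (this example has ties), but its objective value matches the brute-force minimum over the box, as the theorem predicts.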

Proof. The “only if” direction in both parts (a) and (b) is straightforward. We now
prove the “if” direction of part (a) by induction on $\|x - x^*\|_1$, where for any
given $x \in \Re^n$, $\|x\|_1 = \sum_{i=1}^{n} |x_i|$. For any $x \in Z^n$ with $\|x - x^*\|_1 = 1$, we have from
the definition of local minimizers that $f(x^*) \le f(x)$ since $x = x^* + e_i$ or $x = x^* - e_i$
for some $i$. Assume that $f(x^*) \le f(x)$ for any $x \in Z^n$ with $\|x - x^*\|_1 \le k$. Consider
any $x \in Z^n$ with $\|x - x^*\|_1 = k + 1$. If $x^*$ and $x$ are unordered, it is easy to verify
that

$$\|x \wedge x^* - x^*\|_1 < \|x - x^*\|_1 \quad \text{and} \quad \|x \vee x^* - x^*\|_1 < \|x - x^*\|_1,$$

and by the induction assumption, $f(x \wedge x^*) \ge f(x^*)$ and $f(x \vee x^*) \ge f(x^*)$, which
together with the definition of $L^\natural$-convexity imply that

$$f(x) \ge f(x \wedge x^*) + f(x \vee x^*) - f(x^*) \ge f(x^*).$$

If $x^*$ and $x$ are ordered, without loss of generality assume that $x \le x^*$. It is clear
that $x \le (x + e) \wedge x^* \le x^*$ and $x \le x \vee (x^* - e) \le x^*$. If $x = x \vee (x^* - e)$, we
immediately have from the definition of local minimizers that $f(x^*) \le f(x)$. If
$x \ne x \vee (x^* - e)$, we have that

$$\|(x + e) \wedge x^* - x^*\|_1 < \|x - x^*\|_1 \quad \text{and} \quad \|x \vee (x^* - e) - x^*\|_1 < \|x - x^*\|_1,$$

and by the induction assumption,

$$f((x + e) \wedge x^*) \ge f(x^*) \quad \text{and} \quad f(x \vee (x^* - e)) \ge f(x^*).$$

Again from the definition of $L^\natural$-convexity, we have that

$$f(x) \ge f((x + e) \wedge x^*) + f(x \vee (x^* - e)) - f(x^*) \ge f(x^*).$$

We now prove the “if” direction of part (b) by induction on $\|x - x^*\|_1$. For
any $x \in Z^n$ with $\|x - x^*\|_1 = 1$, it is clear from the definition of local minimizers
that $f(x^*) \le f(x)$ since $x = x^* + e_i$ or $x = x^* - e_i$ for some $i$. Assume that
$f(x^*) \le f(x)$ for any $x \in Z^n$ with $\|x - x^*\|_1 \le k$. Consider any $x \in Z^n$ with
$\|x - x^*\|_1 = k + 1$. We have that there exist some $i \in \mathrm{supp}^+(x - x^*) \cup \{0\}$ and
$j \in \mathrm{supp}^+(x^* - x) \cup \{0\}$ with $i \ne j$ such that $\|x - (e_i - e_j) - x^*\|_1 < \|x - x^*\|_1$, and
thus,

$$f(x) \ge f(x - (e_i - e_j)) + f(x^* + (e_i - e_j)) - f(x^*) \ge f(x^*),$$

where the first inequality follows from the definition of $M^\natural$-convexity and the
second inequality from the induction assumption and the definition of local
minimizers. ∎

Exercise 2.1. Assume that a function $f : \Re^n \to \Re$ is continuous.

(a) Prove that $f$ is convex if and only if it satisfies the midpoint convexity;
namely, for any $x, x' \in \Re^n$,

$$f\Bigl(\frac{x + x'}{2}\Bigr) \le \frac{1}{2}\bigl(f(x) + f(x')\bigr).$$

(b) Prove that $f$ is convex if and only if it satisfies the equidistance convexity;
that is, for any $x, x' \in \Re^n$ and $\alpha \in [0, 1]$,

$$f(x) + f(x') \ge f(x - \alpha(x - x')) + f(x' + \alpha(x - x')).$$

Exercise 2.2. (Private Communication with Peng Sun) Assume that $f : \Re \to \Re$ is
convex, and random variables $X_1, X_2, \ldots$ are nonnegative and independently and
identically distributed. Prove that $E[f(\sum_{i=1}^{n} X_i)]$ is convex in $n$ on the set of natural
numbers.

Exercise 2.3. Prove Theorem 2.2.4. 

Exercise 2.4. Assume that a function $f(\cdot, \cdot)$ is defined on the product space
$\Re^n \times \Re^m$ and $f(\cdot, y)$ is convex for any given $y \in \Re^m$. Let $\xi$ be a random vector in
$\Re^m$. Prove the following:

(a) The exponential function $\exp(x)$ is strictly increasing and convex.

(b) The function $E_\xi[\exp(w + f(x, \xi))]$ is jointly convex in $x$ and $w$.

(c) The function $\ln(E_\xi[\exp(f(x, \xi))])$ is convex.

Exercise 2.5. If $f : \Re^n \times \Re^m \to \Re$ is quasiconvex, show that $g(x) = \min_y f(x, y)$
is also quasiconvex provided that $g$ is well defined.

Exercise 2.6. Assume that $A$ is a lattice in the product space $\Re^n \times \Re^m$. Prove
that the set function $S(t) = \{x \in \Re^n \mid (x, t) \in A\}$ is increasing on the set
$\{t \in \Re^m \mid S(t) \ne \emptyset\}$.



Exercise 2.7. Let $Z$ be the set of all integers in $\Re$. Prove that a function $f$ is
convex on $Z$ if and only if either of the following two conditions holds:

(a) $\Delta f(x)$ is nondecreasing, where $\Delta f(x) = f(x + 1) - f(x)$.

(b) There exists a convex function $g$ on $\Re$ such that $g(x) = f(x)$ for all $x \in Z$.
In other words, $g$ is a convex extension of $f$.



Exercise 2.8. Prove that a function $f : Z^n \to \Re$ is $L^\natural$-convex if and only if for
any $x, x' \in Z^n$, the following discrete midpoint convexity holds:

$$f(x) + f(x') \ge f\left(\left\lfloor \frac{x + x'}{2} \right\rfloor\right) + f\left(\left\lceil \frac{x + x'}{2} \right\rceil\right), \qquad (2.17)$$

where for any $x \in \Re^n$, $\lfloor x \rfloor$ is an n-dimensional integer vector derived by rounding
down each component of $x$ to its nearest integer, and $\lceil x \rceil$ is an n-dimensional
integer vector derived by rounding up each component of $x$ to its nearest integer.



Exercise 2.9. Prove Theorem 2.3.3. 



Exercise 2.10. Prove that the set of minimizers of an $L^\natural$-convex ($M^\natural$-convex)
function is $L^\natural$-convex ($M^\natural$-convex).

Exercise 2.11. Prove that any $L^\natural$-convex ($M^\natural$-convex) set $S \subseteq Z^n$ is hole-free;
that is, $S$ is exactly the intersection of the convex hull of $S$ and $Z^n$.



Exercise 2.12. Prove Theorem 2.3.8. 



Exercise 2.13. Prove the following statements by providing counterexamples:
(a) a quadratic $M^\natural$-convex function in $\Re^n$ may not be laminar convex; (b) the
condition in (2.16) is sufficient but not necessary for a quadratic function to be
$M^\natural$-convex in $\Re^n$.

Exercise 2.14. Prove or disprove the following statement: The infimal convolu- 
tion of two laminar convex functions defined on the same laminar family remains 
laminar convex. 



Exercise 2.15. Prove Theorem 2.3.9. 

Exercise 2.16. The definitions of $L^\natural$-convex functions and $M^\natural$-convex functions
on $\Re^n$ are different from the original ones in Murota and Shioura (2004), which
explicitly impose the convexity condition. You are asked to show that they are
equivalent under the continuity assumption; that is, continuous $L^\natural$-convex functions
and $M^\natural$-convex functions defined on $\Re^n$ are convex. Is the continuity assumption
dispensable?

Exercise 2.17. Prove that an $M^\natural$-convex function $f : Z^n \to \Re$ is supermodular
in $Z^n$.

Exercise 2.18. (Chen et al. 2012b) Consider the following optimization problem
parameterized by two-dimensional vectors $x \in S = \{Ay : y \in D\}$:

$$f(x) = \max\ g(y) \quad \text{s.t.} \quad Ay = x,\ y \in D,$$

where $A$ is a nonnegative $2 \times n$ matrix, $D$ is a closed convex lattice of $\Re^n$, and
$g : D \to \Re$. Show that if $g$ is concave and supermodular on $D$, then so is $f$ on $S$.

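Exercise 2.19. (Hu 2011) For $i = 1, 2, \ldots, m$, let $D_i$ be a subset of $\Re^2$ and
$g_i : D_i \to \Re$. Define

$$f(x) = \max\ \sum_{i=1}^{m} g_i(x_i) \quad \text{s.t.} \quad \sum_{i=1}^{m} x_i = x,\ \ x_i \in D_i \ \ \forall\, i = 1, 2, \ldots, m.$$

Show that if $D_i$ are $L^\natural$-convex sets and $g_i$ are $L^\natural$-concave functions, then $f$ is
$L^\natural$-concave.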



3 

Game Theory 



Game theory provides a powerful mathematical framework for modeling and ana- 
lyzing systems with multiple decision makers, referred to as players, with possibly 
conflicting objectives. A game studied in game theory consists of a set of players, 
a set of strategies (or moves) available to the players, and their payoffs (or util- 
ities) for each combination of their strategies. Depending on whether the players 
can sign enforceable binding agreements, game theory consists of two branches: 
noncooperative game theory and cooperative game theory. Noncooperative game 
theory provides concepts and tools to study the behaviors of the players when they 
make their decisions independently. Cooperative game theory, on the other hand, 
assumes that it is possible for the players to sign enforceable binding agreements 
and provides concepts describing basic principles these binding agreements should 
follow. Both noncooperative game theory and cooperative game theory have been 
widely used in many disciplines, such as economics, political science, social science, 
as well as biology and computer science, among others. They have also received 
considerable attention in supply chain management literature in recent years. In 
this chapter, we provide a concise introduction to some of the key concepts and 
results that are most relevant in our context. We refer to Osborne (2003) and 
Myerson (1997) for both noncooperative game theory and cooperative game the- 
ory, Fudenberg and Tirole (1991) and Basar and Olsder (1999) for noncooperative 
game theory, Vives (2000) on oligopoly pricing from the perspective of noncooperative
game theory, and Peleg and Sudhölter (2007) for cooperative game theory.

3.1 Noncooperative Game Theory

Noncooperative games can be represented in both extensive form and 
normal form. A game in extensive form specifies the sequence of moves of the 
players and the associated information structure based on which the players decide 
their moves, while a game in normal form is a simultaneous move game in which 
the players move only once and at the same time. Whether to represent a game in 
either extensive form or normal form depends on which form is more convenient for 
a specific application. For example, a Stackelberg game in which one player, ref- 
erred to as the leader, makes its move first followed by the moves of other players, 
referred to as the followers, after observing the leader’s move, is naturally described 
in extensive form. However, mathematically, any extensive form representation of 
a game can be equivalently translated into a game in normal form, and a game in 
normal form can be treated as an extensive form game with simultaneous moves. 
For our purpose, we will restrict our attention to games in normal form here. 

A game in normal form is specified by a triple $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$, where
$N = \{1, 2, \ldots, n\}$ is the set of players, $S_i$ is the set of strategies of player $i$,
$u_i : S \to \Re$ is player $i$’s payoff as a function of the composition of all players’
strategies, referred to as a strategy profile, and $S = \prod_{i \in N} S_i$ is the set of all
strategy profiles. All these
elements are common knowledge to the players. That is, the players know these 
elements, they know the other players know these elements, they know the other 
players know the other players know these elements, and so on. 

A simple example of a noncooperative game is the famous prisoner’s dilemma. 
In this game, two criminals, referred to as Players 1 and 2, are arrested and im- 
prisoned. They have to decide independently whether to betray the other or keep 
silent. If both criminals keep silent, each of them will serve one year in prison; if 
both betray, each serves two years; if one betrays and the other keeps silent, the 
one who betrays is set free and the other will serve three years. 

The normal form of the prisoner’s dilemma is specified by $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$,
where $N = \{1, 2\}$, $S_1 = S_2 = \{\text{Silent}, \text{Betray}\}$, and the payoff of each
player is the negative of the number of years served in prison by the player. The 
payoffs of the game can be described in Table 3.1. In the table, the first column 
specifies the strategies of Player 1 and the first row specifies the strategies of Player 
2. In each cell, the first number is the payoff of Player 1 and the second number 
is the payoff of Player 2, given their corresponding strategies. 





              Silent      Betray
Silent        -1, -1      -3,  0
Betray         0, -3      -2, -2

TABLE 3.1. The prisoner’s dilemma

3.1.1 Definition and Existence of Nash Equilibrium

Given the specification of a game, one is interested in predicting its outcome:
which strategies the players will take and what payoff each of them will receive.
In this respect, the Nash equilibrium is one of the most important concepts in
noncooperative game theory.

Definition 3.1.1 Given a game $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$, a strategy profile
$s^* = (s_1^*, \ldots, s_n^*) \in S$ is a Nash equilibrium (in pure strategies) if for any $i \in N$,

$$u_i(s^*) \ge u_i(s_i, s_{-i}^*) \quad \forall\, s_i \in S_i,$$

where for a given $s \in S$, $s_{-i} \in \prod_{j \ne i} S_j$ is the strategy profile of all players except
player $i$.

The above definition says that in a Nash equilibrium, any player’s strategy
maximizes its own payoff assuming that the other players’ strategies are specified
in the Nash equilibrium; that is, for any $i \in N$,

$$s_i^* \in \arg\max_{s_i \in S_i} u_i(s_i, s_{-i}^*).$$

Clearly, a Nash equilibrium specifies basic requirements rational players would
respect and is often used as an appropriate prediction of their behaviors. In fact,
when playing the game, a player may conjecture the strategies of other players.
In an equilibrium, the conjectures of the players are correct and the players, if
rational, would respectively maximize their payoffs given their conjectures.

Consider the prisoner’s dilemma. For Player 1, it is clear that its best strategy 
is to betray Player 2 no matter what strategy Player 2 takes. Similarly, Player 2’s 
best strategy is to betray Player 1. Thus, the strategy profile (Betray, Betray) is 
a Nash equilibrium. It is interesting to observe that both players can be better 
off by keeping silent. However, the strategy profile (Silent, Silent) is not a Nash 
equilibrium. 
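
The check carried out above for the prisoner’s dilemma can be mechanized for any small two-player game. The following is a minimal sketch, not part of the original text: the function name `pure_nash_equilibria` and the dictionary layout of the payoffs are illustrative assumptions. It tests the inequality of Definition 3.1.1 for every strategy profile.

```python
from itertools import product

def pure_nash_equilibria(strategies, payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    strategies: (list of Player 1 strategies, list of Player 2 strategies)
    payoffs: dict mapping (s1, s2) -> (u1, u2)
    (Both names are illustrative, not notation from the text.)
    """
    s1_set, s2_set = strategies
    equilibria = []
    for s1, s2 in product(s1_set, s2_set):
        u1, u2 = payoffs[(s1, s2)]
        # Player 1 cannot gain by deviating while Player 2 keeps s2.
        best1 = all(payoffs[(d, s2)][0] <= u1 for d in s1_set)
        # Player 2 cannot gain by deviating while Player 1 keeps s1.
        best2 = all(payoffs[(s1, d)][1] <= u2 for d in s2_set)
        if best1 and best2:
            equilibria.append((s1, s2))
    return equilibria

# Payoffs from Table 3.1 (negative of the years served in prison).
prisoners_dilemma = {
    ("Silent", "Silent"): (-1, -1),
    ("Silent", "Betray"): (-3, 0),
    ("Betray", "Silent"): (0, -3),
    ("Betray", "Betray"): (-2, -2),
}
moves = ["Silent", "Betray"]
print(pure_nash_equilibria((moves, moves), prisoners_dilemma))
# prints [('Betray', 'Betray')]
```

Enumeration is only practical for small finite strategy sets; it is meant to illustrate the definition rather than to serve as a general solution method.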

Despite its conceptual appeal, the concept of Nash equilibrium faces several
theoretical and practical difficulties. One of the main difficulties is that a game
may not admit any Nash equilibrium, as demonstrated in the following matching
penny game. In this game, two players, Player 1 and Player 2, each of whom has a
penny, need to decide independently which side to show. If the sides of the pennies
match, namely, both players show either heads or tails, Player 1 collects both
pennies and thus has a gain of one penny; otherwise, Player 2 collects both pennies
and has a gain of one penny. Formally, we face a game $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$, with
$N = \{1, 2\}$, $S_1 = S_2 = \{\text{Head}, \text{Tail}\}$, and $(u_1, u_2)$ described in Table 3.2.



            Head        Tail
Head       +1, -1      -1, +1
Tail       -1, +1      +1, -1

TABLE 3.2. Payoffs of matching penny game

The matching penny game is a zero-sum game in which the total payoff of the
two players is zero for any strategy profile. It does not have a Nash equilibrium. To
see this, notice that for any strategy chosen by Player 1, Player 2’s best response
is to pick a different strategy, while for any strategy chosen by Player 2, Player
1’s best response is to follow the same strategy. Thus, no strategy profile is
a Nash equilibrium.

We first present a set of conditions under which the existence of a Nash equi- 
librium is guaranteed. 

Theorem 3.1.2 Given a game $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$, if for any player $i \in N$,
its strategy set $S_i$ is a nonempty, convex, and compact set in a Euclidean space,
its payoff function $u_i$ is continuous in $s \in S$ and quasiconcave in $s_i \in S_i$, then the
game has a Nash equilibrium. If, in addition, the game is symmetric, namely, for
any $i, j \in N$, $S_i = S_j$, and $u_i(s_i, s_{-i}) = u_j(s'_j, s'_{-j})$ if $s_i = s'_j$ and $s_{-i}$ and $s'_{-j}$ are
the same when ignoring the identities of the players, there is a symmetric Nash
equilibrium.

The proof is an application of the well-known Kakutani fixed-point theorem, 
which we state in the following without proof. 

Theorem 3.1.3 Let $C$ be a nonempty, compact, and convex subset of $\Re^n$. Let $T$
be a set-valued operator that maps any $x \in C$ to a nonempty and convex subset of
$C$. If the set $\{(x, y) : x \in C, y \in T(x)\}$, referred to as the graph of $T$, is closed, then
$T$ has a fixed point. That is, there exists an $x \in C$ such that $x \in T(x)$.

To gain some intuition, let’s look at a simple case with a function $f : [0, 1] \to [0, 1]$.
If $f$ is continuous, then one can see directly from a graph that the curve
$y = f(x)$ must intersect the line $y = x$ at some $x \in [0, 1]$; that is, a fixed point
exists.
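
The one-dimensional intuition can also be made computational. The sketch below is an illustration under stated assumptions, not a method from the text: it applies bisection to $g(x) = f(x) - x$, which satisfies $g(0) \ge 0$ and $g(1) \le 0$ whenever $f$ maps $[0, 1]$ into itself. The helper name `fixed_point_bisection` and the choice $f = \cos$ are hypothetical.

```python
import math

def fixed_point_bisection(f, lo=0.0, hi=1.0, tol=1e-10):
    """Approximate a fixed point of a continuous f mapping [lo, hi] into itself."""
    # g(x) = f(x) - x has g(lo) >= 0 and g(hi) <= 0 because f(x) stays in [lo, hi],
    # so the intermediate value theorem yields a root of g, i.e., a fixed point of f.
    def g(x):
        return f(x) - x

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) >= 0:
            lo = mid   # invariant: g(lo) >= 0
        else:
            hi = mid   # invariant: g(hi) <= 0
    return (lo + hi) / 2

# cos maps [0, 1] into [cos(1), 1], a subset of [0, 1], so a fixed point exists.
x = fixed_point_bisection(math.cos)
print(x, math.cos(x))
```

Bisection relies on the ordering of the real line, so it does not extend to the set-valued, multidimensional setting of the Kakutani theorem; it only illustrates why continuity forces the curve to cross the diagonal.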

Theorem 3.1.2 is an immediate consequence of the Kakutani fixed-point theorem.
To see this, denote for any $s \in S$, $T_i(s) = \arg\max_{s'_i \in S_i} u_i(s'_i, s_{-i})$, the best
response of player $i$ given the strategies of the other players, and note that a Nash
equilibrium is exactly a fixed point of the best response operator $T = (T_1, \ldots, T_n)$,
which maps a strategy profile in $S$ to a subset of $S$. The continuity of $u_i$ together
with the compactness of $S_i$ implies that $T$ has a closed graph, and the quasiconcavity
of each player’s payoff in its own strategy together with the convexity of the
strategy set implies that $T(x)$ is convex for any $x \in S$. Thus, $T$ has a fixed point
and the game has a Nash equilibrium. To show that a symmetric game admits a
symmetric equilibrium, denote for any $s \in S_1$, $T(s) = \arg\max_{s_1 \in S_1} u_1(s_1, s, \ldots, s)$.
Again, the conditions in Theorem 3.1.2 imply that $T$ has a fixed point $s^*$, and
therefore the symmetric strategy profile $(s^*, s^*, \ldots, s^*)$ is a Nash equilibrium. We
emphasize that $s^*$ is a fixed point of $T$. In most cases, it is not a global maximizer
of $u_1(s, s, \ldots, s)$. A common mistake we have seen is to claim that if $(s^0, s^0, \ldots, s^0)$
is a symmetric equilibrium, then $s^0$ is a global maximizer of $u_1(s, s, \ldots, s)$, or vice
versa.

A different set of conditions to guarantee the existence of a Nash equilibrium
without quasiconcavity of the payoff functions is based on supermodularity. Specifically,
a game $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$ is called (strictly) supermodular if, for each
$i \in N$, $S_i$ is a lattice, and $u_i$ is (strictly) supermodular in $s_i \in S_i$ for any fixed
$s_{-i} \in \prod_{j \ne i} S_j$ and has (strictly) increasing differences in $(s_i, s_{-i})$ on $S_i \times \prod_{j \ne i} S_j$.
For simplicity, we restrict our discussion to cases in which $S_i$ is a lattice in a
Euclidean space. We have the following theorem about a Nash equilibrium for
supermodular games.

Theorem 3.1.4 Given a supermodular game $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$, if for any $i \in N$,
$S_i$ is nonempty and compact, $u_i$ is continuous in $s_i \in S_i$ for any fixed
$s_{-i} \in \prod_{j \ne i} S_j$, then the set of Nash equilibria is nonempty and has a largest and a
smallest element. If, in addition, the payoff functions $u_i$ are parameterized by $t \in T$,
where $T$ is a subset of a Euclidean space, and have increasing differences in $(s_i, t)$ for
any fixed $s_{-i} \in \prod_{j \ne i} S_j$, the largest and the smallest Nash equilibria are increasing in $t$.

Proof. We prove this theorem using the generalized Tarski fixed-point theorem,
Theorem 2.2.9. Again, for any $s \in S$, let $T_i(s) = \arg\max_{s'_i \in S_i} u_i(s'_i, s_{-i})$ and
$T = (T_1, \ldots, T_n)$. Our assumption on $u_i(s_i, s_{-i})$ together with Theorem 2.2.8
implies that $T_i(s)$, as a subset of $S_i$, is a nonempty and compact lattice
for any $s \in S$ and $T_i$ is increasing. Thus, $T(s)$, as a subset of $S$, is a nonempty and
compact lattice for any $s \in S$ and $T$ is increasing. From Theorem 2.2.9, the set of
fixed points of $T$, or equivalently the set of Nash equilibria, is nonempty and has
a largest and a smallest element.

If, in addition, the payoff functions $u_i$ ($i \in N$) are parameterized by $t$ in the
parameter set and have increasing differences in $(s_i, t)$ for any fixed $s_{-i} \in \prod_{j \ne i} S_j$, from
Theorem 2.2.8, the set $T_i(s, t) = \arg\max_{s'_i \in S_i} u_i(s'_i, s_{-i}, t)$ is increasing in $t$ as well.
The remaining argument is similar to the one in the previous paragraph using the
second part of Theorem 2.2.9. ∎

One remedy for games without Nash equilibria in pure strategies is to allow
mixed strategies. Specifically, a mixed strategy for a player $i$ with a strategy set
$S_i$ is a probability distribution defined over $S_i$. Let $\Sigma_i$ be the set of all mixed
strategies of player $i$ and $\Sigma = \Sigma_1 \times \cdots \times \Sigma_n$. We now define a Nash equilibrium
in mixed strategies.

Definition 3.1.5 Given a game $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$, a strategy profile
$\sigma^* = (\sigma_1^*, \ldots, \sigma_n^*) \in \Sigma$ is a Nash equilibrium in mixed strategies if, for any $i \in N$,

$$u_i(\sigma^*) \ge u_i(s_i, \sigma_{-i}^*) \quad \forall\, s_i \in S_i,$$

where, for a given $\sigma$, $\sigma_{-i}$ is the strategy profile of all players except player $i$ and
$u_i(\sigma)$ is the expected payoff.

Unlike the case with a Nash equilibrium in pure strategies, one can show using
Theorem 3.1.2 that a Nash equilibrium in mixed strategies always exists for games
with finite (pure) strategies. Indeed, using mixed strategies can be regarded as a
way to convexify the payoffs and the strategy sets of a game. Come back to the
matching penny game. We now show it has a unique Nash equilibrium in mixed
strategies. For this purpose, let $\sigma_1 = (x_1, x_2) \in \Sigma_1 = \{(x_1, x_2) : x_1 + x_2 = 1, x_1 \ge 0, x_2 \ge 0\}$,
which specifies a mixed strategy of choosing “Head” and “Tail” with
probabilities $x_1$ and $x_2$, respectively. Similarly, let $\sigma_2 = (y_1, y_2) \in \Sigma_2$ ($= \Sigma_1$). We
have $u_1(\sigma) = (x_1 - x_2)(y_1 - y_2)$ and $u_2(\sigma) = -u_1(\sigma)$. The definition of a Nash
equilibrium in mixed strategies implies that

$$u_1(\sigma^*) = (x_1^* - x_2^*)(y_1^* - y_2^*) \ge u_1((1, 0), \sigma_2^*) = y_1^* - y_2^*,$$

$$u_1(\sigma^*) = (x_1^* - x_2^*)(y_1^* - y_2^*) \ge u_1((0, 1), \sigma_2^*) = -(y_1^* - y_2^*).$$

Thus,

$$|y_1^* - y_2^*| \le u_1(\sigma^*) \le |x_1^* - x_2^*|\,|y_1^* - y_2^*| \le |y_1^* - y_2^*|.$$

If $y_1^* \ne y_2^*$, we must have $|x_1^* - x_2^*| = 1$; that is, $\sigma_1^*$ must be a pure strategy.
Symmetrically, if $x_1^* \ne x_2^*$, $\sigma_2^*$ must be a pure strategy. However, as demonstrated
earlier, the matching penny game does not have a Nash equilibrium in pure strategies.
Thus, we must have $\sigma_1^* = \sigma_2^* = (1/2, 1/2)$. That is, both players choose their
pure strategies with equal probability, resulting in zero expected payoffs for each
of the players.
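
The conclusion for the matching penny game can be verified numerically against Definition 3.1.5. The following is a minimal sketch using exact rational arithmetic; the helper names `expected_u1`, `pure`, and `is_mixed_equilibrium` are illustrative and do not come from the text.

```python
from fractions import Fraction

# Player 1's payoffs from Table 3.2; Player 2's payoffs are the negatives (zero-sum).
U1 = {("Head", "Head"): 1, ("Head", "Tail"): -1,
      ("Tail", "Head"): -1, ("Tail", "Tail"): 1}
MOVES = ["Head", "Tail"]

def expected_u1(sigma1, sigma2):
    """Expected payoff of Player 1 when the two players mix independently."""
    return sum(sigma1[a] * sigma2[b] * U1[(a, b)] for a in MOVES for b in MOVES)

def pure(move):
    """Degenerate mixed strategy that plays `move` with probability 1."""
    return {m: Fraction(int(m == move)) for m in MOVES}

def is_mixed_equilibrium(sigma1, sigma2):
    """Check that no pure deviation raises either player's expected payoff."""
    u1 = expected_u1(sigma1, sigma2)
    u2 = -u1
    ok1 = all(expected_u1(pure(m), sigma2) <= u1 for m in MOVES)
    ok2 = all(-expected_u1(sigma1, pure(m)) <= u2 for m in MOVES)
    return ok1 and ok2

half = {"Head": Fraction(1, 2), "Tail": Fraction(1, 2)}
print(is_mixed_equilibrium(half, half))          # True: equal mixing by both players
print(is_mixed_equilibrium(pure("Head"), half))  # False: Player 2 would switch to Tail
```

Checking only pure deviations suffices because a player's expected payoff is linear in its own mixing probabilities.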

Though the existence of a Nash equilibrium in mixed strategies can be guaran- 
teed even when a Nash equilibrium in pure strategies does not exist, it is not clear 
how mixed strategies would be implemented in supply chain management settings. 
Thus, we mainly focus on a Nash equilibrium in pure strategies and simply refer 
to it as a Nash equilibrium when there is no confusion. 



3.1.2 Uniqueness of Nash Equilibrium 

Another difficulty with the concept of Nash equilibrium is that a game may have
multiple equilibria. In this case, which equilibrium would be an appropriate outcome
often depends on factors that are not included in the formal description of the game.
Consider a game in which two friends, again referred to
as Player 1 and Player 2, decide independently between two locations, A and B,
for lunch. If it happens that they pick the same location, they enjoy each other’s
company and get a payoff of 1 each; otherwise, they eat alone and get a payoff
of 0 (Table 3.3). This is a coordination game without communication. Similar to
the matching penny game, we can present the payoffs of the players in a table
format. One can easily check that the game has two Nash equilibria, (A, A) and
(B, B). From the description of the game, it is not clear how one player can
correctly guess the move of the other player and whether any equilibrium could
be realized.

Thus, in many applications, it is desirable to establish the uniqueness of a Nash 
equilibrium. There are three basic approaches for this purpose, which are termed 
the “contraction,” “univalence,” and “index theory” approaches, respectively. 






         A           B
A      +1, +1      -1, -1
B      -1, -1      +1, +1

TABLE 3.3. Payoffs of coordination game without communication


The first approach is based on the contraction principle, which says that a
contraction operator in a Euclidean space (or, more generally, a complete metric space)
has a unique fixed point. Specifically, given a subset $C$ of a Euclidean space, an
operator $T : C \to C$ is a contraction if there exists a constant $\alpha$ with $0 < \alpha < 1$
such that for any $x, x' \in C$, $\|T(x) - T(x')\| \le \alpha \|x - x'\|$, where $\|\cdot\|$ is a norm (for
example, the commonly used 1-norm, 2-norm, or $\infty$-norm). If $C$ is closed, then $T$
has a unique fixed point in $C$. If, in addition, $C$ is compact, it suffices to require
$\|T(x) - T(x')\| < \|x - x'\|$ for any distinct $x, x' \in C$. A sufficient condition for the best
response operator $T$ to be a contraction is that the $\infty$-norm of the Jacobian of $T$
is less than 1.

Under the assumptions in Theorem 3.1.2, if one can show that the best response
operator is single-valued and a contraction, then we have a unique Nash
equilibrium. One sufficient condition for the case in which the strategy sets $S_i$ are
one-dimensional is the diagonal dominance of the second-order derivatives of the
payoffs

$$\frac{\partial^2 u_i(s)}{\partial s_i^2} + \sum_{j \ne i} \left| \frac{\partial^2 u_i(s)}{\partial s_i \partial s_j} \right| < 0, \quad \forall\, s \in S,\ i \in N, \qquad (3.1)$$

which implies that the $\infty$-norm of the Jacobian of $T$ is less than 1. To see this,
observe that if all equilibria are interior points of the strategy sets and the payoff
functions are differentiable, we have the first-order optimality condition
$\nabla_i u_i(s'_i, s_{-i})|_{s'_i = T_i(s)} = 0$ for $i \in N$, where $\nabla_i u_i(s)$ is the gradient of $u_i$ with
respect to $s_i$. From the implicit function theorem, $\partial T_i(s)/\partial s_i = 0$ and for $j \ne i$,

$$\frac{\partial T_i(s)}{\partial s_j} = -\left( \frac{\partial^2 u_i(s'_i, s_{-i})}{\partial s_i^2} \right)^{-1} \left. \frac{\partial^2 u_i(s'_i, s_{-i})}{\partial s_i \partial s_j} \right|_{s'_i = T_i(s)}.$$

The $\infty$-norm of the Jacobian of $T$ is given by

$$\max_{i \in N} \sum_{j \in N} \left| \frac{\partial T_i(s)}{\partial s_j} \right|.$$

Similar conditions can be derived when the strategy sets $S_i$ are multidimensional.
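
To see the contraction approach at work, consider a small example that is not from the text: a hypothetical two-player game with $S_i = [0, 2]$ and $u_i(s) = -s_i^2 + s_i(1 + b s_j)$, where $b = 0.5$ is an assumed interaction coefficient. Here $\partial^2 u_i/\partial s_i^2 = -2$ and $\partial^2 u_i/\partial s_i \partial s_j = b$, so condition (3.1) holds, and the sketch below iterates the best response operator from several starting profiles.

```python
def best_response(s_other, b=0.5, lo=0.0, hi=2.0):
    """Maximizer of u_i = -s_i**2 + s_i * (1 + b * s_other) over [lo, hi]."""
    # First-order condition: -2 * s_i + 1 + b * s_other = 0.
    unconstrained = (1 + b * s_other) / 2
    # Clipping to [lo, hi] gives the constrained maximizer because u_i is concave in s_i.
    return min(max(unconstrained, lo), hi)

def iterate_best_response(s, steps=60):
    """Apply the best response operator T repeatedly, starting from the profile s."""
    for _ in range(steps):
        s = (best_response(s[1]), best_response(s[0]))
    return s

# Condition (3.1) for this hypothetical game: -2 + |0.5| < 0, and |dT_i/ds_j| = 0.25 < 1.
for start in [(0.0, 0.0), (2.0, 0.0), (1.3, 1.9)]:
    print(start, "->", iterate_best_response(start))
```

Every starting profile converges to the same point, $s_1 = s_2 = 2/3$, which solves $s = (1 + 0.5 s)/2$. This is what the contraction principle predicts: a unique fixed point that is reached by repeated application of $T$.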

The second approach is based on analyzing the first-order optimality conditions
of all players’ maximization problems. A key condition for the uniqueness of a
Nash equilibrium is the diagonally strict concavity of the payoff functions. Define
$\nabla u(s) = [\nabla_1 u_1(s), \ldots, \nabla_n u_n(s)]^T$.

Definition 3.1.6 The payoff functions $\{u_i\}_{i \in N}$ are diagonally strictly concave on
$S$ if for any distinct $s, s' \in S$,

$$(s - s')^T (\nabla u(s) - \nabla u(s')) < 0.$$

When the strategy sets $S_i$ are one-dimensional, the condition (3.1) implies that
the payoff functions are diagonally strictly concave.

Denote by $G(s)$ the Jacobian of $\nabla u(s)$. It is straightforward to show that if
$G(s) + G(s)^T$ is negative definite for any $s \in S$, then the payoff functions are diagonally
strictly concave. Some additional technical conditions are also needed. Specifically,
we assume that the strategy sets have explicit algebraic representations:

$$S_i = \{s_i \in \Re^{n_i} \mid h_{ij}(s_i) \ge 0,\ j = 1, \ldots, r_i\}, \qquad (3.2)$$

where $n_i$ and $r_i$ are positive integers and $h_{ij} : \Re^{n_i} \to \Re$.

The following uniqueness result is due to Rosen (1965). 

Theorem 3.1.7 Consider a game $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$. Assume that $S_i$ is compact
and has the representation (3.2), where $h_{ij}$ is concave, and there exists some
$x_i^0 \in \Re^{n_i}$ such that $h_{ij}(x_i^0) > 0$ for any nonlinear function $h_{ij}$. If the payoff
functions are diagonally strictly concave on $S$, then the game has a unique Nash
equilibrium.

The existence of a Nash equilibrium follows from Theorem 3.1.2. The conditions 
on hij are imposed so that we can write down the first-order optimality conditions, 
commonly referred to as the Karush-Kuhn-Tucker (KKT) conditions, and the 
diagonally strict concavity of the payoff functions is sufficient for the uniqueness 
of a solution to the KKT conditions. 
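
As an illustration of the sufficient condition on $G(s) + G(s)^T$, the sketch below uses a hypothetical two-player game with $u_i(s) = -s_i^2 + s_i(1 + B s_j)$ on $[0, 2]^2$; the payoff form and the coefficient `B = 0.5` are assumptions made for this example only. For this game $G$ is constant, so negative definiteness can be tested once, and the defining inequality of diagonally strict concavity is spot-checked on a grid.

```python
import itertools

B = 0.5  # hypothetical interaction coefficient

def grad_u(s):
    """Vector (du_1/ds_1, du_2/ds_2) for u_i = -s_i**2 + s_i * (1 + B * s_j)."""
    s1, s2 = s
    return (-2 * s1 + 1 + B * s2, -2 * s2 + 1 + B * s1)

def is_negative_definite_2x2(m):
    """Sylvester's criterion for a symmetric 2x2 matrix: m[0][0] < 0 and det(m) > 0."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return m[0][0] < 0 and det > 0

# The Jacobian G of grad_u is constant for this game; form G + G^T.
G = [[-2.0, B], [B, -2.0]]
G_sym = [[G[i][j] + G[j][i] for j in range(2)] for i in range(2)]
print(is_negative_definite_2x2(G_sym))

def dsc_value(s, t):
    """Left-hand side (s - t)^T (grad_u(s) - grad_u(t)) of the defining inequality."""
    gs, gt = grad_u(s), grad_u(t)
    return sum((s[k] - t[k]) * (gs[k] - gt[k]) for k in range(2))

# Spot-check the inequality on all ordered pairs of distinct grid points in [0, 2]^2.
grid = [(a / 2, b / 2) for a in range(5) for b in range(5)]
print(all(dsc_value(s, t) < 0 for s, t in itertools.permutations(grid, 2)))
```

A grid check is only evidence, not a proof; for this quadratic game the matrix test settles the question because $(s - s')^T(\nabla u(s) - \nabla u(s')) = (s - s')^T G (s - s')$.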

The third approach is based on the Poincaré-Hopf index theorem. Consider a
function $g : C \to \Re^m$, where $C$ is a nonempty compact set in $\Re^m$. An implication
of the Poincaré-Hopf index theorem is that under some boundary condition on $C$,
$g(a) = 0$ has a unique solution if the determinant of $-G(a)$ is positive whenever
$g(a) = 0$, where $G(a)$ is the Jacobian of $g$. To gain some intuition, consider a
differentiable function $g : [0, 1] \to \Re$ with $g(0) > 0$ and $g(1) < 0$ whose derivative is
negative whenever $g(a) = 0$. Clearly, $g(a) = 0$
has at least one solution. It cannot have multiple solutions. To see this, let $a^*$ be
any solution of $g(a) = 0$; then for $a > a^*$, $g(a)$ is less than zero until it reaches
zero for the first time at some $a'$. However, this is impossible since $g$ has a negative
derivative at $a'$.

If all equilibria are interior points of the strategy sets and the payoff functions are 
differentiable, then any equilibrium s* would be a solution of the equation system 
Vu(s) = 0. The index theory thus provides an approach to check the uniqueness 
of a Nash equilibrium. That is, if the determinant of the Jacobian of —Vu(s) is 
positive at the equilibria, then we have a unique Nash equilibrium. This approach 
is less restrictive but harder to check than the other two approaches. 



3.2 Cooperative Game Theory 

Cooperative games can be represented in either coalitional form or strategic form.
Cooperative strategic games assume that the players can make binding agreements
on the choice of strategies, while cooperative coalitional games assume that they
can make binding agreements on the distribution of payoffs. Since many applications
can be naturally formulated in coalitional form and the players focus more
on the choice of stable payoffs rather than on the choice of stable strategies, as
any combination of strategies can be supported by a binding agreement, we will
only consider cooperative games in coalitional form.

Cooperative coalitional games can be divided into two categories: transferrable
utilities and nontransferrable utilities. A coalitional game with transferrable utilities
is a pair $(N, V)$, where $N = \{1, 2, \ldots, n\}$ is the set of all players and referred to
as the grand coalition, and $V$ is the characteristic function that maps any subset $S$
of $N$, called a coalition, to a real number with $V(\emptyset) = 0$. If a coalition $S$ forms, its
value $V(S)$ can be allocated in any possible way among its members by allowing
side payments. That is, a feasible payoff vector $x$ of the players in $S$ satisfies the
condition $x(S) \le V(S)$, where we use the notation $x(S)$ to denote $\sum_{i \in S} x_i$ and $x_i$
is the payoff to player $i$. A coalitional game with nontransferrable utilities is a pair
$(N, V)$, where again $N = \{1, 2, \ldots, n\}$ is the set of all players, and $V$ is a mapping
that associates a set in a Euclidean space with each coalition $S$. For simplicity, we
focus on coalitional games with transferrable utilities and we simply refer to them
as coalitional games or cooperative games in coalitional form.

Depending on the properties of the characteristic functions, we can define monotonic,
superadditive, and convex games. Specifically, a coalitional game $(N, V)$ is
called monotonic if

$$V(S) \le V(T), \quad \forall\, S \subseteq T \subseteq N;$$

that is, the larger a coalition in terms of set inclusion, the higher the worth of the
coalition. A coalitional game $(N, V)$ is called superadditive if

$$V(S \cup T) \ge V(S) + V(T), \quad \forall\, S, T \subseteq N,\ S \cap T = \emptyset;$$

namely, combining two disjoint coalitions leads to a higher value. A coalitional
game $(N, V)$ is called convex if

$$V(S) + V(T) \le V(S \cup T) + V(S \cap T), \quad \forall\, S, T \subseteq N.$$

A convex game is also referred to as a supermodular game since $V$ is supermodular
as a set function. One can show that a game $(N, V)$ is convex if and only if

$$V(S \cup \{i\}) - V(S) \le V(T \cup \{i\}) - V(T), \quad \forall\, S \subseteq T \subseteq N,\ i \notin T; \qquad (3.3)$$

that is, the marginal contribution of a player is increasing with respect to set
inclusion. Since $V(\emptyset) = 0$, a convex game is superadditive.
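
For games with a handful of players, the three definitions can be checked directly by enumerating coalitions. The sketch below is illustrative only: the two characteristic functions, $V(S) = |S|^2$ and a three-player majority game, are hypothetical examples rather than games discussed in the text.

```python
from itertools import combinations

def subsets(players):
    """All subsets of `players` as frozensets, including the empty set."""
    items = sorted(players)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_monotonic(players, V):
    """V(S) <= V(T) whenever S is a subset of T."""
    return all(V(S) <= V(T) for S in subsets(players) for T in subsets(players) if S <= T)

def is_superadditive(players, V):
    """V(S | T) >= V(S) + V(T) for all disjoint S and T."""
    return all(V(S | T) >= V(S) + V(T)
               for S in subsets(players) for T in subsets(players) if not (S & T))

def is_convex(players, V):
    """V(S) + V(T) <= V(S | T) + V(S & T) for all S and T."""
    return all(V(S) + V(T) <= V(S | T) + V(S & T)
               for S in subsets(players) for T in subsets(players))

N = frozenset({1, 2, 3})

def square(S):
    """Hypothetical convex game: the value grows quadratically in coalition size."""
    return len(S) ** 2

def majority(S):
    """Hypothetical majority game: a coalition wins 1 if it has at least two players."""
    return 1 if len(S) >= 2 else 0

print(is_monotonic(N, square), is_superadditive(N, square), is_convex(N, square))
# prints True True True
print(is_monotonic(N, majority), is_superadditive(N, majority), is_convex(N, majority))
# prints True True False
```

The majority game shows that superadditivity does not imply convexity: with $S = \{1, 2\}$ and $T = \{2, 3\}$, $V(S) + V(T) = 2$ while $V(S \cup T) + V(S \cap T) = 1$.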

Given a cooperative game $(N, V)$, one would like to know how the coalition
value should be distributed among the players. We will introduce several solution
concepts: the core, the nucleolus, and the Shapley value. A solution concept specifies
for each game a set of payoff vectors in $\Re^n$ whose $i$th components represent
the values allocated to player $i$, under the assumption that the grand coalition
forms. If the players do not form the grand coalition, we can restrict these solution
concepts to whatever coalitions the players form. In cooperative game theory, most
solution concepts can be derived from a list of axioms, which stipulate properties
a reasonable allocation should respect. We list some of the properties desirable for
a solution concept $\sigma(N, V)$.

• Efficiency: Any payoff vector $x \in \sigma(N, V)$ is efficient; namely, $x(N) = V(N)$,
or, equivalently, the grand coalition value $V(N)$ is fully allocated, where for
any set $S \subseteq N$ and any vector $z \in \Re^n$, $z(S) = \sum_{i \in S} z_i$.

• Individual rationality: Any payoff vector $x \in \sigma(N, V)$ is individually rational;
namely, $x_i \ge V(\{i\})$ for all $i \in N$, or, equivalently, no player receives less
than what he can get on his own.

• Group rationality: Any payoff vector $x \in \sigma(N, V)$ is group rational; namely,
$x(S) \ge V(S)$ for all $S \subseteq N$, or, equivalently, no group of players receives less
than what the group can get on its own.

• Symmetry: If the marginal contributions of two players $i$ and $j$ are the same
for all $S \subseteq N \setminus \{i, j\}$, then any payoff vector $x \in \sigma(N, V)$ should assign the
same value to $i$ and $j$; that is, $x_i = x_j$.

• Anonymity: For a game $(N, V)$ and any permutation $\pi$ of $N$,

$$\sigma(N, V^\pi) = \{x^\pi : x \in \sigma(N, V)\},$$

where $V^\pi(S) = V(\{\pi(i)\}_{i \in S})$ and for any $x \in \Re^n$, $x^\pi$ is a vector such that
$x^\pi_i = x_{\pi(i)}$. That is, the names of the players would not affect their payoffs.

• Superadditivity: Given two games $(N, V)$ and $(N, V')$, $\sigma(N, V) + \sigma(N, V') \subseteq
\sigma(N, V + V')$, where $(N, V + V')$ is a game whose characteristic function is
the sum of those of $(N, V)$ and $(N, V')$.

• Additivity: Given two games $(N, V)$ and $(N, V')$, $\sigma(N, V + V') = \sigma(N, V) +
\sigma(N, V')$.

• Covariance under strategic equivalence: Given a game $(N, V)$, a scalar $\alpha > 0$
and $\beta \in \Re^n$, $\sigma(N, \alpha V + \beta) = \alpha \sigma(N, V) + \beta$, where in the game $(N, \alpha V + \beta)$,
for any $S \subseteq N$, $(\alpha V + \beta)(S) = \alpha V(S) + \beta(S)$.

• Null player: For any $x \in \sigma(N, V)$, $x_i = 0$ if player $i$ has zero marginal
contribution added to any $S$; that is, $V(S \cup \{i\}) = V(S)$ for all $S \subseteq N$.

The efficiency property is sometimes referred to as the Pareto optimality. Group 
rationality implies individual rationality. Additivity implies superadditivity and 
covariance under strategic equivalence. 

Another desirable property for a solution concept is consistency. To present it,
we need to define reduced games. Given a game $(N, V)$, for a nonempty set $S \subseteq N$
and a payoff vector $x \in \Re^n$, define a reduced game $(S, V_{x,S})$ as follows: $V_{x,S}(\emptyset) = 0$,
$V_{x,S}(S) = V(N) - x(N \setminus S)$, and

$$V_{x,S}(T) = \max_{T' \subseteq N \setminus S} \{V(T \cup T') - x(T')\} \quad \forall\, \emptyset \ne T \subsetneq S.$$

A solution concept $\sigma(N, V)$ is consistent if for any $x \in \sigma(N, V)$ and any nonempty
$S \subseteq N$, $x_S \in \sigma(S, V_{x,S})$, where $x_S = (x_i)_{i \in S}$. It is called weakly consistent if for
any $x \in \sigma(N, V)$ and $1 \le |S| \le 2$, $x_S \in \sigma(S, V_{x,S})$, where $|S|$ is the cardinality of $S$.
The consistency properties basically say that if a payoff is reasonable in the grand
coalition, any coalition should accept that the payoffs allocated to them are reasonable.

Here are some monotonicity properties that a solution concept $\sigma(N, V)$ may
be required to satisfy. Among them, the coalitional monotonicity and strong monotonicity
are defined for a single-valued solution concept. In this case, we simply use $\sigma(N, V)$
to denote the payoff vector when there is no confusion.

• Aggregate monotonicity: For two games $(N, V)$ and $(N, V')$, if $V(N) \ge V'(N)$
and $V(S) = V'(S)$ for all $S \subsetneq N$, then for any $x \in \sigma(N, V')$, there
exists $y \in \sigma(N, V)$ such that $y_i \ge x_i$ for all $i \in N$.

• Coalitional monotonicity: For two games $(N, V)$ and $(N, V')$, if $V(T) \ge V'(T)$
for some $T \subseteq N$ and $V(S) = V'(S)$ for all $S$ with $T \ne S \subseteq N$,
then $\sigma(N, V)_i \ge \sigma(N, V')_i$ for any $i \in T$.

• Strong monotonicity: For two games $(N, V)$ and $(N, V')$ and any $i \in N$,
$\sigma(N, V)_i \ge \sigma(N, V')_i$ if

$$V(S \cup \{i\}) - V(S) \ge V'(S \cup \{i\}) - V'(S) \quad \forall\, S \subseteq N.$$

In the following, we introduce the concepts of the core, the nucleolus, and the 
Shapley value and present their properties. 

3.2.1 Core

The core of a game $(N, V)$, denoted by $C(N, V)$, is the set of efficient payoff vectors
that are group rational. That is,

$$C(N, V) = \{x \in \mathbb{R}^n : x(N) = V(N),\ x(S) \ge V(S)\ \ \forall\, S \subseteq N\}.$$

Clearly, the core is a polyhedral set that can be defined by $2^n$ linear inequalities.
A payoff vector $x$ in the core is called a core allocation. Under such an allocation,
no group of players has an incentive to deviate from the grand coalition,
and thus the grand coalition is stable. The allocation is also fair in
the sense that no group would subsidize its complement since for any $S \subseteq N$,

$$x(N \setminus S) = x(N) - x(S) \le V(N) - V(S);$$

that is, the payoff to the group $N \setminus S$ is no more than its added value to the grand
coalition. The core is attractive because it satisfies several plausible properties.

Theorem 3.2.1 The core $C(N, V)$, if nonempty, satisfies the following properties:
(a) efficiency, (b) individual rationality, (c) group rationality, (d) anonymity, (e)
superadditivity, (f) covariance under strategic equivalence, (g) null player, (h) consistency,
and (i) aggregate monotonicity.

Proof. We only prove parts (g) and (h). The other parts are straightforward. To
prove part (g), note that for a null player $i$, $V(S \cup \{i\}) = V(S)$ for any $S \subseteq N$.
Specifically, $V(\{i\}) = 0$ and $V(N \setminus \{i\}) = V(N)$. For any $x \in C(N, V)$, we have
that $x_i \ge V(\{i\})$. On the other hand,

$$x_i = x(N) - x(N \setminus \{i\}) \le V(N) - V(N \setminus \{i\}) = 0.$$

Thus, $x_i = 0$.

We now show that the core is consistent. To see this, let $x \in C(N, V)$. For any
nonempty sets $S \subseteq N$ and $T \subset S$, $x_S(S) = x(N) - x(N \setminus S) = V_{x,S}(S)$, and there
exists $T' \subseteq N \setminus S$ such that

$$V_{x,S}(T) = V(T \cup T') - x(T') = V(T \cup T') - x(T \cup T') + x(T) \le x_S(T).$$

Thus, $x_S$ is in the core of the reduced game $(S, V_{x,S})$. ∎

Unfortunately, the core of a game may be empty, and even if it is nonempty, 
finding an allocation in the core is usually computationally challenging. In fact, 
determining whether a given vector is in the core or not can be challenging as well 
since the core is defined by an exponential number (in n) of linear inequalities. 
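
For small games, core membership can be checked directly from the definition. The sketch below is illustrative only: the helper names `coalitions` and `in_core` and the example game $V(S) = |S|^2$ are assumptions made for this example, not part of the text. The loop enumerates all $2^n - 1$ nonempty coalitions, which is exactly the exponential effort mentioned above.

```python
from itertools import combinations

def coalitions(n):
    """Yield every nonempty coalition of players 0..n-1 as a frozenset."""
    for size in range(1, n + 1):
        for members in combinations(range(n), size):
            yield frozenset(members)

def in_core(x, V, tol=1e-9):
    """Check x(N) = V(N) and x(S) >= V(S) for every nonempty coalition S."""
    n = len(x)
    if abs(sum(x) - V(frozenset(range(n)))) > tol:
        return False  # the payoff vector is not efficient
    return all(sum(x[i] for i in S) >= V(S) - tol for S in coalitions(n))

# Hypothetical 3-player game: a coalition of size s is worth s^2.
V = lambda S: len(S) ** 2
print(in_core([3, 3, 3], V))  # equal split of V(N) = 9
print(in_core([0, 4, 5], V))  # player 0 receives less than V({0}) = 1
```

The first vector passes every test, while the second violates individual rationality for player 0, so only the first is a core allocation of this example game.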

In the following, we present a sufficient and necessary condition for the existence
of a nonempty core credited to Bondareva (1963) and Shapley (1967). For this
purpose, define for any $S \subseteq N$ the characteristic vector $\chi_S$ as a vector in $\mathbb{R}^n$ with

$$\chi_S^i = \begin{cases} 1, & \text{if } i \in S, \\ 0, & \text{if } i \notin S. \end{cases}$$

A collection $\mathcal{B}$ of subsets of $N$ is called balanced if there exist positive numbers
$\delta_S$, $S \in \mathcal{B}$, such that

$$\sum_{S \in \mathcal{B}} \delta_S \chi_S = \chi_N.$$

The collection $\{\delta_S\}_{S \in \mathcal{B}}$ is called a system of balancing weights associated with
$\mathcal{B}$. A game is called balanced if for any balanced collection $\mathcal{B}$ and any associated
system of balancing weights $\{\delta_S\}_{S \in \mathcal{B}}$,

$$\sum_{S \in \mathcal{B}} \delta_S V(S) \le V(N).$$

Theorem 3.2.2 (The Bondareva-Shapley theorem) A cooperative coalitional
game $(N, V)$ has a nonempty core if and only if it is balanced.



Proof. Notice that the core of $(N, V)$ is nonempty if and only if $V(N)$ is the optimal
objective value of the linear program

$$\begin{array}{ll} \min & x(N) \\ \text{s.t.} & x(S) \ge V(S), \quad \forall\, S \subseteq N. \end{array}$$

Let $y(S)$ be the dual variable associated with the inequality indexed by $S$. The
dual of the above linear program is

$$\begin{array}{ll} \max & \sum_{S \subseteq N} V(S)\, y(S) \\ \text{s.t.} & \sum_{S : i \in S} y(S) = 1, \quad \forall\, i \in N, \\ & y(S) \ge 0. \end{array}$$

Since $V(N)$ is the objective value of the dual for the feasible solution $\{y^0(S)\}_{S \subseteq N}$
with $y^0(N) = 1$ and $y^0(S) = 0$ for $S \subsetneq N$, from the strong duality theorem, the
core is nonempty if and only if for any feasible solution $\{y(S)\}_{S \subseteq N}$ of the dual,

$$V(N) \ge \sum_{S \subseteq N} V(S)\, y(S).$$

For any feasible solution $\{y(S)\}_{S \subseteq N}$ of the dual, we can define a balanced collection
$\mathcal{B} = \{S : y(S) > 0\}$ and its associated system of balancing weights $\delta_S = y(S)$
for $S \in \mathcal{B}$. On the other hand, any system of balancing weights $\{\delta_S\}_{S \in \mathcal{B}}$ associated
with a balanced collection $\mathcal{B}$ can be extended to a feasible solution $\{y(S)\}_{S \subseteq N}$
of the dual by defining $y(S) = \delta_S$ if $S \in \mathcal{B}$ and 0 otherwise. Therefore, the core
is nonempty if and only if for any balanced collection $\mathcal{B}$ and its associated system
of balancing weights $\{\delta_S\}_{S \in \mathcal{B}}$,

$$V(N) \ge \sum_{S \subseteq N} V(S)\, y(S) = \sum_{S \in \mathcal{B}} \delta_S V(S);$$

that is, the game is balanced. ∎
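
As a small illustration of the theorem (an example chosen here, not taken from the text), consider the three-player majority game with $V(S) = 1$ if $|S| \ge 2$ and $V(S) = 0$ otherwise. The collection $\mathcal{B} = \{\{1,2\}, \{1,3\}, \{2,3\}\}$ is balanced with weights $\delta_S = 1/2$, because every player belongs to exactly two of its members:

```latex
\sum_{S \in \mathcal{B}} \delta_S \chi_S
  = \tfrac12 (1,1,0) + \tfrac12 (1,0,1) + \tfrac12 (0,1,1)
  = (1,1,1) = \chi_N,
\qquad
\sum_{S \in \mathcal{B}} \delta_S V(S) = \tfrac32 > 1 = V(N).
```

The balancedness inequality fails for this collection, so the game is not balanced and its core is empty.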

Though general cooperative games may have empty cores, the core of a convex
game is always nonempty and has a nice characterization. We will show that the
following greedy algorithm constructs an extreme point of the core. Given a
permutation $\pi$ of $\{1, 2, \dots, n\}$, let $S_i^\pi = \{\pi(j) : j \le i\}$ for any $i = 1, \dots, n$ and
$S_0^\pi = \emptyset$. Define for $i = 1, \dots, n$,

$$x^\pi_{\pi(i)} = V(S_i^\pi) - V(S_{i-1}^\pi);$$

that is, the payoff the greedy algorithm assigns to player $\pi(i)$ is the marginal
contribution added to the players in $S_{i-1}^\pi$.

Theorem 3.2.3 For a convex game $(N, V)$, the set of extreme points of the core
is exactly $\{x^\pi : \pi \text{ is a permutation}\}$.

Proof. It is sufficient to prove that the payoff vector, denoted as $x^*$, associated
with the identity permutation $\pi$ with $\pi(i) = i$ for any $i \in N$ is an extreme point
of the core. According to its definition,

$$x^*_i = V(S_i) - V(S_{i-1}) \quad \forall\, i \in N,$$

and $x^* = (x^*_1, \dots, x^*_n)$, where $S_i = \{1, 2, \dots, i\}$ and $S_0 = \emptyset$.

We first show that $x^*$ is in the core. Notice that

$$x^*(N) = \sum_{i \in N} (V(S_i) - V(S_{i-1})) = V(N).$$

For any $S \subseteq N$, let $S = \{i_1, \dots, i_k\}$ with $i_1 < i_2 < \dots < i_k$ and define for any $j \le k$,
$\bar{S}_j = \{i_1, \dots, i_j\}$, with $\bar{S}_0 = \emptyset$. We have for any $j = 1, \dots, k$,

$$\begin{aligned} x^*_{i_j} &= V(S_{i_j}) - V(S_{i_j - 1}) \\ &\ge V((S_{i_j - 1} \cap S) \cup \{i_j\}) - V(S_{i_j - 1} \cap S) \\ &= V(\bar{S}_j) - V(\bar{S}_{j-1}), \end{aligned}$$

where the inequality follows from the supermodularity of the characteristic function
$V$. Therefore,

$$x^*(S) = \sum_{j=1}^{k} x^*_{i_j} \ge \sum_{j=1}^{k} (V(\bar{S}_j) - V(\bar{S}_{j-1})) = V(S),$$

which implies that $x^*$ is in the core.

To show that $x^*$ is an extreme point of the core, consider the following optimization
problem:

$$\begin{array}{ll} \min & \sum_{i \in N} f_i x_i \\ \text{s.t.} & x \in C(N, V), \end{array} \qquad (3.4)$$

where $f = (f_1, \dots, f_n)$ is a given vector in $\mathbb{R}^n$ with $f_1 > f_2 > \dots > f_n$. It suffices
to show that $x^*$ is the unique optimal solution of the above optimization problem.
For this purpose, note that for any given payoff vector $x$ in the core,

$$\begin{aligned} \sum_{i \in N} f_i x_i &= \sum_{i \in N} f_i (x(S_i) - x(S_{i-1})) \\ &= \sum_{i \in N \setminus \{n\}} (f_i - f_{i+1})\, x(S_i) + f_n x(S_n) \\ &\ge \sum_{i \in N \setminus \{n\}} (f_i - f_{i+1})\, V(S_i) + f_n V(N) \\ &= \sum_{i \in N} f_i (V(S_i) - V(S_{i-1})), \end{aligned}$$

where the inequality holds since $x \in C(N, V)$, and the last equality follows from
the definition of $x^*$. Since $f_i$ is strictly decreasing in $i$, the inequality would hold
as a strict inequality if $x(S_i) \neq V(S_i)$ for some $i \in N \setminus \{n\}$. If $x(S_i) = V(S_i)$ for
all $i \in N$, then $x = x^*$. Hence, $x^*$ is the unique optimal solution of problem (3.4) and
therefore an extreme point of the core.

To show that no extreme point exists other than those in $\{x^\pi : \pi \text{ is a permutation}\}$,
it suffices to show that for any given vector $f$, the optimization
problem (3.4) has an optimal solution in $\{x^\pi : \pi \text{ is a permutation}\}$. In fact, without
loss of generality, assume that

$$f_1 \ge f_2 \ge \dots \ge f_n.$$

Following an analysis similar to the one in the above paragraph, we can show that
$x^*$ is optimal to problem (3.4). ∎

Of course, it is possible that $x^\pi = x^{\pi'}$ for different permutations $\pi$ and $\pi'$.
However, if the characteristic function $V$ is strictly supermodular, namely,

$$V(S) + V(T) < V(S \cup T) + V(S \cap T) \quad \forall\, S, T \subseteq N \text{ with } S \not\subseteq T,\ T \not\subseteq S,$$

then one can show that $x^\pi \neq x^{\pi'}$ (see Exercise 3.5).
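
The greedy construction is easy to try on a small convex game. The following is a minimal sketch under stated assumptions: the function name `marginal_vector` and the game $V(S) = |S|^2$ on players 0, 1, 2 are hypothetical choices made for illustration, and `order` plays the role of the arrival sequence $(\pi(1), \dots, \pi(n))$.

```python
from itertools import permutations

def marginal_vector(order, V):
    """Payoffs when players join in `order`; each receives V(S + i) - V(S)."""
    x = [0] * len(order)
    joined = frozenset()
    for player in order:
        x[player] = V(joined | {player}) - V(joined)
        joined = joined | {player}
    return x

# Hypothetical strictly supermodular game on players 0, 1, 2: V(S) = |S|^2.
V = lambda S: len(S) ** 2
for order in permutations(range(3)):
    print(order, marginal_vector(order, V))
```

For this example game each of the six orders produces a different rearrangement of the payoffs 1, 3, and 5, in line with the remark on strictly supermodular characteristic functions.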

For cooperative games with empty cores, an alternative solution concept is the
$\epsilon$-core. For a given real number $\epsilon$, the $\epsilon$-core $C_\epsilon$ of a game $(N, V)$ is given as

$$C_\epsilon = \{x \in \mathbb{R}^n : x(N) = V(N),\ x(S) \ge V(S) - \epsilon\ \ \forall\, \emptyset \neq S \subset N\}.$$

That is, a payoff vector $x$ is in the $\epsilon$-core if it is efficient and a group of players
won't be better off if it deviates from the grand coalition to form a subcoalition
by paying a cost $\epsilon$. Clearly, $C_\epsilon \subseteq C_{\epsilon'}$ if $\epsilon < \epsilon'$, and $C_\epsilon$ is nonempty for large $\epsilon$.

The least-core of a game $(N, V)$ is defined as the intersection of all nonempty
$\epsilon$-cores of $(N, V)$, or equivalently $C_{\epsilon_0}$, where $\epsilon_0$ is the smallest $\epsilon$ such that $C_\epsilon$ is
nonempty.

3.2.2 Nucleolus 

As shown in the previous subsection, the core may be empty, and even if nonempty,
it may contain more than one allocation. In contrast, the nucleolus exists and is unique.
Since the nucleolus is a singleton, the term is often used for its unique element. To
present its definition, we first define the lexicographic order. A vector $x \in \mathbb{R}^n$ is
smaller than a vector $y \in \mathbb{R}^n$ in the lexicographic order if there exists an index
$k \in \{0, 1, \dots, n - 1\}$ such that

$$x_i = y_i \quad \forall\, i \le k, \qquad x_{k+1} < y_{k+1}.$$

We write $x <_{lex} y$ and $x \le_{lex} y$ if $x$ is smaller than $y$ and $x$ is smaller than or equal
to $y$ in the lexicographic order, respectively. Consider a game $(N, V)$. For a given
payoff vector $x \in \mathbb{R}^n$, define the excess of coalition $S$ under payoff $x$ as

$$e_x(S) = V(S) - x(S) \quad \forall\, \emptyset \neq S \subset N,$$

which is the difference between the value of coalition $S$ and the payoff proposed to it. Let
$\theta(e_x)$ be a vector in $\mathbb{R}^{2^n - 2}$ whose components consist of the excesses $e_x(S)$ in
decreasing order. That is, its first component is the largest $e_x(S)$, its second component
is the second-largest $e_x(S)$, and so on. It is straightforward to show that for
$i = 1, \dots, 2^n - 2$, $\theta_i(e_x)$ is a continuous function of $x$. In addition, if $x$ is in the
core of $(N, V)$, then $\theta_i(e_x) \le 0$ for all $i = 1, \dots, 2^n - 2$.

The nucleolus of the game $(N, V)$, denoted by $\mathcal{N}(N, V)$, consists of optimal solutions
minimizing $\theta(e_x)$ over all efficient payoff vectors in terms of the lexicographic
order. That is, given a payoff vector $x \in \mathcal{N}(N, V)$,

$$\theta(e_x) \le_{lex} \theta(e_y) \quad \text{for any efficient payoff vector } y.$$

One can argue that the nucleolus is the most equitable allocation because it sequentially
minimizes the excess of the worst-treated coalitions.

Theorem 3.2.4 For a given game $(N, V)$, we have the following:

(a) The nucleolus $\mathcal{N}(N, V)$ is always nonempty and is a singleton.

(b) If $(N, V)$ is superadditive, the nucleolus is individually rational.

(c) The nucleolus belongs to the core if the core is nonempty.

Proof. To prove part (a), we first describe a procedure to find an element of the
nucleolus. Given any efficient payoff vector $y^0$, let

$$\delta = \max_{\emptyset \neq S \subset N} e_{y^0}(S).$$

To search for the nucleolus, it suffices to focus on payoff vectors in the set

$$I_0 = \{y \in \mathbb{R}^n : y(N) = y^0(N),\ y_i \ge V(\{i\}) - \delta\ \ \forall\, i \in N\},$$

which is nonempty since $y^0 \in I_0$ and compact. Since $\theta_i(e_y)$ is continuous in $y$, for
$i = 1, \dots, 2^n - 2$, we can recursively show that

$$I_i = \arg\min_{y \in I_{i-1}} \theta_i(e_y)$$

is nonempty and compact as well. The set $I_{2^n - 2}$ is exactly the nucleolus, which is
clearly nonempty.

We now show that the nucleolus is a singleton. Assume to the contrary that
$x, y \in I_{2^n - 2}$ with $x \neq y$. The above procedure implies that $\theta(e_x) = \theta(e_y)$. Let
$S_1, S_2, \dots, S_{2^n - 2}$ be an ordered sequence of all nonempty proper subsets of $N$ so
that $\theta_i(e_x) = e_x(S_i)$, and let $k$ be the smallest index such that $\theta_k(e_y) \neq e_y(S_k)$.
We assume that

$$\text{either } e_x(S_l) < e_x(S_k) \text{ or } e_y(S_l) < \theta_k(e_y) = e_x(S_k) \quad \forall\, l > k.$$

This assumption can be made without loss of generality. In fact, if the assumption
does not hold, there exists $l$ with $l > k$ such that $e_x(S_l) = e_x(S_k) = e_y(S_l)$. We can
simply consider a new sequence in which the positions of $S_k$ and $S_l$ are switched.
Repeat the process until we end up with a sequence satisfying the assumption.

Define $z = (x + y)/2$. For any $l < k$, since $e_x(S_l) = e_y(S_l)$, we have that $\theta_l(e_z) =
\theta_l(e_x) = e_z(S_l)$, and for any $l \ge k$,

$$e_z(S_l) = (e_x(S_l) + e_y(S_l))/2 < e_x(S_k) = \theta_k(e_x).$$

Therefore, $\theta_k(e_z) < \theta_k(e_x)$, which contradicts the assumption that $x$ minimizes
$\theta(e_x)$ in terms of the lexicographic order. Thus, the nucleolus is unique.

To prove part (b), let $x$ be the nucleolus and $i \in \arg\max_{j \in N} (V(\{j\}) - x_j)$.
Assume to the contrary that $V(\{i\}) > x_i$. We claim $i \in S$ for any $S \subset N$ such
that $\theta_1(e_x) = e_x(S)$; that is, $S \in \arg\max_{S'} e_x(S')$. In fact, if $i \notin S$, from the
superadditive property of $V$, $e_x$ is superadditive and

$$e_x(S \cup \{i\}) \ge e_x(S) + e_x(\{i\}) = e_x(S) + V(\{i\}) - x_i > e_x(S),$$

which contradicts the definition of $S$.

Define a new payoff vector $y$ such that $y_i = x_i + \epsilon$ and $y_j = x_j - \frac{\epsilon}{n-1}$ for $j \neq i$
for some small $\epsilon > 0$. Pick any $\bar{S} \in \arg\max_{S'} e_x(S')$. Since $i \in \bar{S}$, we have that

$$e_y(\bar{S}) = e_x(\bar{S}) - \frac{n - |\bar{S}|}{n - 1}\,\epsilon < e_x(\bar{S}) = \theta_1(e_x).$$

For any $S$ with $e_x(S) < \theta_1(e_x)$, as long as $\epsilon$ is sufficiently small, $y(S) - x(S)$ is
small and

$$e_y(S) = e_x(S) - (y(S) - x(S)) < \theta_1(e_x) - \frac{n - |\bar{S}|}{n - 1}\,\epsilon = e_y(\bar{S}).$$

Thus, the maximum excess under $y$ is attained at a coalition in $\arg\max_{S'} e_x(S')$,
and every such coalition has a strictly smaller excess under $y$ than under $x$. Hence
$\theta_1(e_y) < \theta_1(e_x)$, which contradicts the definition of the nucleolus. Thus, the nucleolus
is individually rational.

Finally, if the core is nonempty, pick any core allocation $z$. Again, let $x$ be the
nucleolus. By definition,

$$V(S) - x(S) = e_x(S) \le \theta_1(e_x) \le \theta_1(e_z) \le 0 \quad \forall\, \emptyset \neq S \subset N.$$

Thus, the nucleolus is in the core. ∎

Loosely speaking, if the core is nonempty, the nucleolus occupies a center position
in the sense that the minimum distance to any boundary of the core is as large as
possible.
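
The sorted excess vector $\theta(e_x)$ that underlies the nucleolus can be computed by enumeration for small games. The sketch below is only an illustration: the name `excess_vector` and the game $V(S) = |S|^2$ are assumptions introduced here, and the comparison relies on the fact that Python orders lists lexicographically.

```python
from itertools import combinations

def excess_vector(x, V):
    """Excesses V(S) - x(S) over nonempty proper coalitions, largest first."""
    n = len(x)
    excesses = []
    for size in range(1, n):  # sizes 1..n-1: nonempty and proper
        for S in combinations(range(n), size):
            excesses.append(V(frozenset(S)) - sum(x[i] for i in S))
    return sorted(excesses, reverse=True)

# Hypothetical game on three players with V(S) = |S|^2, so V(N) = 9.
V = lambda S: len(S) ** 2
equal = excess_vector([3, 3, 3], V)
skewed = excess_vector([1, 3, 5], V)
print(equal, skewed, equal < skewed)
```

In this example the equal split has the lexicographically smaller excess vector of the two efficient allocations, so the skewed allocation cannot be the nucleolus.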

In addition to what we proved in the above theorem, the nucleolus satisfies the 
following properties. 

Theorem 3.2.5 The nucleolus $\mathcal{N}(N, V)$ satisfies the following properties: (a) efficiency,
(b) anonymity, (c) symmetry, (d) covariance under strategic equivalence,
(e) null player, and (f) consistency.

Parts (a) and (b) of the above theorem are obvious. However, the proof of the 
remaining parts is involved and is left as an exercise. 

3.2.3 Shapley Value 

The Shapley value is another single-valued solution concept. For a cooperative
game $(N, V)$, the Shapley value $\phi \in \mathbb{R}^n$ is given by

$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!}\, (V(S \cup \{i\}) - V(S)) \quad \forall\, i \in N.$$

The Shapley value can be interpreted as follows. The players arrive one by one
to form the grand coalition, and all possible arrival sequences are equally likely.
When Player $i$ arrives and finds that a set of players $S$ has arrived, he receives a payoff
$V(S \cup \{i\}) - V(S)$, his marginal contribution added to $S$. The Shapley value of
Player $i$ is his expected marginal contribution. To see this, note that there are $n!$
different arrival sequences in total and each arrival sequence, corresponding to a
permutation of $\{1, 2, \dots, n\}$, is equally likely. In addition, the number of different
arrival sequences in which the players in $S$ are exactly those who arrive before $i$ is
$|S|!\,(n - |S| - 1)!$, the product of the number of different arrival sequences of the
first $|S|$ players before $i$ and that of the remaining $n - |S| - 1$ players after $i$.
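
The formula translates directly into a short enumeration for small games. This is a minimal sketch, not an efficient method: the function name `shapley_value` and the three-player example game are hypothetical, and the running time grows exponentially in $n$.

```python
from itertools import combinations
from math import factorial

def shapley_value(n, V):
    """Shapley value of players 0..n-1 via the weighted sum over coalitions."""
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):  # |S| ranges over 0..n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for members in combinations(others, size):
                S = frozenset(members)
                total += weight * (V(S | {i}) - V(S))
        phi.append(total)
    return phi

# Hypothetical game: a coalition earns 1 only if it contains player 0
# together with at least one partner.
def V(S):
    return 1.0 if 0 in S and len(S) >= 2 else 0.0

print(shapley_value(3, V))
```

In this example player 0 receives 2/3 and the other two players receive 1/6 each, and the payoffs sum to $V(N) = 1$.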

The Shapley value is widely used because it is unique and enjoys several plausible
properties.

Theorem 3.2.6 The Shapley value satisfies the following properties: (a) efficiency,
(b) symmetry, (c) anonymity, (d) additivity, (e) null player, (f) (strong) aggregate
monotonicity, (g) coalitional monotonicity, and (h) strong monotonicity.

To show that the Shapley value is efficient, recall that the Shapley value of Player
$i$ can be interpreted as Player $i$'s expected marginal contribution added to its
predecessors when any arrival sequence is equally likely. For any given arrival
sequence, the total marginal contribution added to their predecessors of all players
is exactly $V(N)$. Thus, $\sum_{i \in N} \phi_i = V(N)$. It is straightforward to show that the
Shapley value satisfies the remaining properties.

Interestingly, the Shapley value is the unique solution concept that satisfies
certain subsets of the above properties.

Theorem 3.2.7 The Shapley value is the unique solution concept that satisfies the
efficiency, symmetry, additivity, and null player properties. It is also the unique
solution concept that satisfies the efficiency, symmetry, and strong monotonicity
properties.

The proof of the theorem is left as an exercise. 

For a convex game, we have shown that for any permutation $\pi$ of $\{1, 2, \dots, n\}$,
the greedy algorithm gives a payoff vector $x^\pi$ in the core, which is exactly
the vector of marginal contributions added to their predecessors of the players
if the arrival sequence is $(\pi(1), \pi(2), \dots, \pi(n))$, with $\pi(i)$ denoting the $i$th arrival.
Therefore, the Shapley value is the convex combination of the payoff vectors $x^\pi$ for all
permutations $\pi$ and thus is in the core. Actually, it is the center of gravity of
the core since the weights of the payoff vectors $x^\pi$, the extreme points of the core,
are $1/n!$.

Theorem 3.2.8 For a convex game $(N, V)$, the Shapley value is in the core.

Unfortunately, for a nonconvex game, the Shapley value may not be a core allocation
even when the core is nonempty.



3.3 Exercises

Exercise 3.1. Consider the generalized form of the prisoner's dilemma in which
the two prisoners' payoffs are specified in general values (Table 3.4). Find the
equilibria of the game for all possible values of A, B, C, and D.

TABLE 3.4. The generalized form of the prisoner's dilemma

              Silent    Betray
    Silent    A, A      C, B
    Betray    B, C      D, D

Exercise 3.2. Consider a game $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$. If $S_i$ is one-dimensional
and $u_i$ is smooth for all $i \in N$, show that the diagonally dominant
conditions (3.1) on the second derivatives of $u_i$ are sufficient for the best response
operator to be a contraction and thus a unique equilibrium exists. Extend the
diagonal dominance conditions to the multidimensional case.

Exercise 3.3. Provide the detailed proof of Theorem 3.1.7. 

Exercise 3.4. Prove that a game $(N, V)$ is convex if and only if (3.3) holds.

Exercise 3.5. Show that for a cooperative game with a strictly supermodular
characteristic function, a different permutation $\pi$ of $\{1, 2, \dots, n\}$ leads to a different
extreme point $x^\pi$ of its core in Sect. 3.2.1.

Exercise 3.6. Prove that the nucleolus is the unique solution concept for the class
of all cooperative coalitional games that satisfies the efficiency, consistency, covariance
under strategic equivalence, and anonymity properties.

Exercise 3.7. Prove Theorem 3.2.5. 

Exercise 3.8. Prove Theorem 3.2.7. 

Exercise 3.9. Consider the cooperative coalitional game $(N, V)$ with $N = \{1, 2, 3\}$
and the characteristic function given by

$$V(\emptyset) = 0, \quad V(\{i\}) = 0\ \ \forall\, i \in N, \quad V(S) = a\ \ \forall\, |S| = 2, \quad V(N) = 1,$$

where $a$ is a parameter. Compute the core, the nucleolus, and the Shapley value.

Exercise 3.10. Consider the cooperative production game in which $k$ players
collaborate to produce $n$ products using $m$ resources. Assume that player $l$ ($l =
1, \dots, k$) has a resource vector $b^l \in \mathbb{R}^m$. One unit of product $j$ ($j = 1, \dots, n$)
can be sold at a profit $r_j$ and takes $a_{ij}$ units of resource $i$ ($i = 1, \dots, m$) to
produce. To determine how the profit should be distributed among the players if
they decide to cooperate, we can formulate this as a production game $(K, V)$ with
$K = \{1, 2, \dots, k\}$ and, for any coalition $S$, $V(S)$ being the maximum profit the
coalition can generate by pooling their resources together. Is the game convex?
Show that it has a nonempty core. Is the Shapley value in the core?



4 

Worst-Case Analysis 



4.1 Introduction 

Since most complicated logistics problems, for example, the bin-packing problem
and the traveling salesman problem, are $\mathcal{NP}$-Hard, it is unlikely that polynomial-time
algorithms will be developed for their optimal solutions. Consequently, a great
deal of work has been devoted to the development and analyses of heuristics.
In this chapter, we demonstrate one important tool, referred to as worst-case
performance analysis, which establishes the maximum deviation from optimality
that can occur for a given heuristic algorithm. We will characterize the worst-case
performance of a variety of algorithms for the bin-packing problem and the
traveling salesman problem. The results obtained here serve as important building
blocks in the analysis of algorithms for vehicle routing problems.

Worst-case effectiveness is essentially measured in two different ways. Take a
generic problem, and let $I$ be a particular instance. Let $Z^*(I)$ be the total cost of
the optimal solution for instance $I$. Let $Z^H(I)$ be the total cost of the solution
provided by the heuristic $H$ on instance $I$. Then, the absolute performance ratio
of heuristic $H$ is defined as

$$R^H = \inf\left\{ r \ge 1 \ \middle|\ \frac{Z^H(I)}{Z^*(I)} \le r \text{ for all } I \right\}.$$

This measure, of course, is specific to the particular problem. The absolute performance
ratio is often achieved for very small problem instances. It is therefore
desirable to have a measure that takes into account problems of large size only.
This measure is the asymptotic performance ratio. For a heuristic $H$, this ratio is
defined as

$$R^H_\infty = \inf\left\{ r \ge 1 \ \middle|\ \exists\, n \text{ such that } \frac{Z^H(I)}{Z^*(I)} \le r \text{ for all } I \text{ with } Z^*(I) \ge n \right\}.$$

This measure sometimes gives a more accurate picture of a heuristic's performance.
Note that $R^H_\infty \le R^H$.

In general, it is important also to show that no better worst-case bound (for a 
given heuristic) is possible. This is usually achieved by providing an example, or 
family of examples, where the bound is tight, or arbitrarily close to tight. 

In this chapter, we will analyze several heuristics for two difficult problems, 
the bin-packing problem and the traveling salesman problem, along with their 
worst-case performance bounds. 



4.2 The Bin-Packing Problem 

The bin-packing problem (BPP) can be stated as follows: Given a list of $n$ real
numbers $L = (w_1, w_2, \dots, w_n)$, where we call $w_i \in (0, 1]$ the size of item $i$, the
problem is to assign each item to a bin such that the sum of the item sizes in a
bin does not exceed 1, while minimizing the number of bins used. For simplicity,
we also use $L$ as a set, but this should cause no confusion. In this case, we write
$i \in L$ to mean $w_i \in L$.

Many heuristics have been developed for this problem since the early 1970s.
Some of the more popular ones are first-fit (FF), best-fit (BF), first-fit decreasing
(FFD), and best-fit decreasing (BFD) analyzed by Johnson et al. (1974). First-fit
and best-fit assign items to bins according to the order they appear in the
list without using any knowledge of subsequent items in the list; these are online
algorithms. First-fit can be described as follows: Place item 1 in bin 1. Suppose we
are packing item $j$; place item $j$ in the lowest indexed bin whose current content
does not exceed $1 - w_j$. The BF heuristic is similar to FF except that it places item
$j$ in the bin whose current content is the largest but does not exceed $1 - w_j$. In
contrast to these heuristics, FFD first sorts the items in nonincreasing order of their
size and then performs FF. Similarly, BFD first sorts the items in nonincreasing
order of their size and then performs BF. These are called offline algorithms.
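
The four heuristics can be sketched in a few lines. The code below is an illustrative implementation under stated assumptions, not the authors' own: the function names, the small tolerance `eps` used to guard against floating-point round-off, and the four-item example list are all choices made here.

```python
def first_fit(sizes, capacity=1.0, eps=1e-9):
    """Pack items in list order; each goes to the lowest-indexed bin it fits in."""
    bins = []  # bins[k] is the list of item sizes placed in bin k
    for w in sizes:
        for content in bins:
            if sum(content) + w <= capacity + eps:
                content.append(w)
                break
        else:
            bins.append([w])  # no open bin fits: open a new one
    return bins

def best_fit(sizes, capacity=1.0, eps=1e-9):
    """Pack items in list order; each goes to the fullest bin it fits in."""
    bins = []
    for w in sizes:
        fitting = [b for b in bins if sum(b) + w <= capacity + eps]
        if fitting:
            max(fitting, key=sum).append(w)  # ties go to the lowest index
        else:
            bins.append([w])
    return bins

def decreasing(heuristic, sizes, capacity=1.0):
    """Offline variant: sort in nonincreasing order, then run the heuristic."""
    return heuristic(sorted(sizes, reverse=True), capacity)

items = [0.4, 0.7, 0.3, 0.6]  # illustrative list; two bins suffice
print(len(first_fit(items)), len(best_fit(items)))
print(len(decreasing(first_fit, items)), len(decreasing(best_fit, items)))
```

On this list FF opens three bins, because it places the item of size 0.3 with the item of size 0.4 and leaves no room for 0.6, whereas BF, FFD, and BFD all find a packing with two bins.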

Let $b^H(L)$ be the number of bins produced by a heuristic $H$ on list $L$. Similarly,
let $b^*(L)$ be the minimum number of bins required to pack the items in list $L$; that
is, $b^*(L)$ is the optimal solution value of the bin-packing problem defined on list $L$.

The best asymptotic performance bounds for the FF and BF heuristics are given
in Garey et al. (1976), where they show that

$$b^{FF}(L) \le \left\lceil \frac{17}{10}\, b^*(L) \right\rceil$$

and

$$b^{BF}(L) \le \left\lceil \frac{17}{10}\, b^*(L) \right\rceil.$$

Here $\lceil x \rceil$ is defined as the smallest integer greater than or equal to $x$.

The best asymptotic performance bounds for FFD and BFD have been obtained
by Baker (1985), who shows that

$$b^{FFD}(L) \le \frac{11}{9}\, b^*(L) + 3$$

and

$$b^{BFD}(L) \le \frac{11}{9}\, b^*(L) + 3.$$

Johnson et al. (1974) provide instances with arbitrarily large values of $b^*(L)$ such
that the ratios $b^{FF}(L)/b^*(L)$ and $b^{BF}(L)/b^*(L)$ approach $\frac{17}{10}$ and instances where
$b^{FFD}(L)/b^*(L)$ and $b^{BFD}(L)/b^*(L)$ approach $\frac{11}{9}$. Thus, the maximum deviation
from optimality for all lists that are sufficiently "large" is no more than 70 % of the
minimal number of bins in the case of FF and BF, and 22.2 % in the case of FFD and BFD.

We now show that by using simple arguments, one can characterize the absolute
performance ratio for each of the four heuristics. We start, however, by demonstrating
that in general we cannot expect to find a polynomial-time heuristic with
absolute performance ratio less than $\frac{3}{2}$.

Lemma 4.2.1 Suppose there exists a polynomial-time heuristic $H$ for the BPP
with $R^H < 3/2$; then $\mathcal{P} = \mathcal{NP}$.

Proof. We show that if such a heuristic exists, then we can solve the $\mathcal{NP}$-Complete
2-partition problem in polynomial time. This problem is defined as follows: Given
a set $A = \{a_1, a_2, \dots, a_n\}$, does there exist an $A_1 \subset A$ such that
$\sum_{a_i \in A_1} a_i = \sum_{a_i \in A \setminus A_1} a_i$?

For a given instance $A$ of 2-partition, we construct an instance $L$ of the bin-packing
problem with item sizes $a_i$ and bins of capacity $\frac{1}{2} \sum_{a_i \in A} a_i$. Observe that if
there exists an $A_1$ such that $\sum_{a_i \in A_1} a_i = \sum_{a_i \in A \setminus A_1} a_i = \frac{1}{2} \sum_{a_i \in A} a_i$, then heuristic $H$
must find a solution such that $b^H(L) = 2$. On the other hand, if there is no such
$A_1$ in the 2-partition problem, then the corresponding bin-packing problem has no
solution with fewer than three bins, and hence $b^H(L) \ge 3$.

Consequently, to solve the 2-partition problem, apply the heuristic $H$ to the
corresponding bin-packing problem. If $b^H(L) \ge 3$, there is no subset $A_1$ with the
desired property. Otherwise, there is one. Since 2-partition is $\mathcal{NP}$-Complete, this
implies $\mathcal{P} = \mathcal{NP}$. ∎
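
The reduction in the proof can be made concrete as follows. This is only a sketch under explicit assumptions: no polynomial-time heuristic with ratio below 3/2 is known, so the hypothetical `exact_two_bins` below is an exponential brute-force stand-in for $H$, and the input is assumed to be a list of positive integers.

```python
from itertools import combinations

def exact_two_bins(sizes, capacity):
    """Brute-force stand-in for H: True if the items fit into two bins."""
    total = sum(sizes)
    for r in range(len(sizes) + 1):
        for subset in combinations(range(len(sizes)), r):
            load = sum(sizes[i] for i in subset)
            if load <= capacity and total - load <= capacity:
                return True
    return False

def has_two_partition(a):
    """Decide 2-partition through the bin-packing instance used in the proof."""
    total = sum(a)
    if total % 2:  # positive integers with an odd sum cannot be split evenly
        return False
    # Items keep their sizes; every bin has capacity equal to half the total.
    return exact_two_bins(a, total // 2)

print(has_two_partition([3, 1, 1, 2, 2, 1]))
print(has_two_partition([1, 1, 4]))
```

The first list splits into two halves of weight 5, so two bins suffice; in the second, the item of size 4 exceeds the bin capacity 3 and no equal split exists.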

Let XF be either FF or BF, and let XFD be either FFD or BFD. In this section,
we prove the following result due to Simchi-Levi (1994).

Theorem 4.2.2 For all lists $L$,

$$\frac{b^{XF}(L)}{b^*(L)} \le \frac{7}{4}$$

and

$$\frac{b^{XFD}(L)}{b^*(L)} \le \frac{3}{2}.$$

In view of Lemma 4.2.1, it is clear that FFD and BFD have the best possible
absolute performance ratios for the bin-packing problem among all polynomial-time
heuristics, unless $\mathcal{P} = \mathcal{NP}$. As Garey and Johnson (1979, p. 128) point out, it is easy to
construct examples in which an optimal solution uses two bins while FFD or BFD
uses three bins. Similarly, Johnson et al. give examples in which an optimal solution
uses 10 bins while FF and BF use 17 bins. Thus, the absolute performance ratio
for FFD and BFD is exactly $\frac{3}{2}$, while it is at least 1.7 and no more than $\frac{7}{4}$ for FF
and BF.

We now define the following terms, which will be used throughout this section. 
An item is called large if its size is (strictly) greater than 0.5; otherwise, it is called 
small. Define a bin to be of type I if it has only small items and of type II if it is 
not a type I bin; that is, it has at least one large item in it. A bin is called feasible 
if the sum of the item sizes in the bin does not exceed 1. An item is said to fit in a 
bin if the bin resulting from the insertion of this item is a feasible bin. In addition, 
a bin is said to be opened when an item is placed in a bin that was previously 
empty. 



4.2.1 First-Fit and Best-Fit

The proof of the worst-case bounds for FF and BF, the first part of Theorem 4.2.2,
is based on the following observation. Recall XF = FF or BF.

Lemma 4.2.3 Consider the $j$th bin opened by XF ($j \ge 2$). Any item that was
assigned to it before it was more than half-full does not fit in any bin opened by
XF prior to bin $j$.

Proof. The property is clearly true for FF, and in fact holds for any item assigned
to the $j$th bin, $j \ge 2$, not necessarily to items assigned to it before it was more
than half-full. To prove the property for BF, suppose by contradiction that item $i$
was assigned to the $j$th bin before it was more than half-full, and this item fits in
one of the previously opened bins, say the $k$th bin. Clearly, in that case, $i$ cannot
be the first item assigned to the $j$th bin since BF would not have opened a new
bin if $i$ fits in one of the previously opened bins. Let the levels of bins $k$ and $j$,
just before the time item $i$ was packed by BF, be $\alpha_k$ and $\alpha_j$, and let item $h$ be
the first item in bin $j$. Hence, $w_h \le \alpha_j \le \frac{1}{2}$ by the hypothesis. Since BF assigns
an item to the bin where it fits with the largest content, and item $i$ would have fit
in bin $k$, we have $\alpha_j > \alpha_k$. Thus, $\alpha_k < \frac{1}{2} \le 1 - w_h$, meaning that item $h$ would have fit in
bin $k$, a contradiction. ∎

We use Lemma 4.2.3 to construct a lower bound on the minimum number of
bins. For this purpose, we introduce the following procedure. For a given integer $v$,
$2 \le v \le b^{XF}(L)$, select $v$ bins from those produced by XF. Index the $v$ bins in the
order they are opened, starting with 1 and ending with $v$. Let $X_j$ be the set of items
assigned by XF to the $j$th bin before it was more than half-full, $j = 1, 2, \dots, v$.
Let $S_j$ be the set of items assigned by XF to the $j$th bin, $j = 1, 2, \dots, v$. Observe
that $X_j \subseteq S_j$ for all $j = 1, 2, \dots, v$.

Procedure LBBP (Lower-Bound Bin-Packing)

Step 1: Let $X'_i = X_i$, $i = 1, 2, \dots, v$.

Step 2: For $i = 1$ to $v - 1$, do

    Let $j = \max\{k : X'_k \neq \emptyset\}$.

    If $j \le i$, stop.

    Else, let $u$ be the smallest item in $X'_j$.

    Set $S_i \leftarrow S_i \cup \{u\}$ and $X'_j \leftarrow X'_j \setminus \{u\}$.

In view of Lemma 4.2.3, it is clear that Procedure LBBP generates nonempty
subsets $S_1, S_2, \dots, S_m$, for some $m \le v$, such that $\sum_{i \in S_j} w_i > 1$ for $j \le m - 1$ and
possibly for $j = m$. This is true since by Lemma 4.2.3, item $u$ (as defined in the
LBBP procedure), originally assigned to bin $j$ before it was more than half-full,
does not fit in any bin $i$ with $i < j$. Then the following must hold.

Lemma 4.2.4

$$\max\left\{ \Big| \bigcup_{j=m+1}^{v} X_j \Big|,\ m - 1 \right\} < \sum_{j=1}^{m} \sum_{i \in S_j} w_i.$$

Proof. Since bins $1, 2, \dots, m - 1$ generated by Procedure LBBP are not feasible,
we have $\sum_{j=1}^{m} \sum_{i \in S_j} w_i > m - 1$. Note that every item in $\bigcup_{j=m+1}^{v} X_j$ is moved by
Procedure LBBP to exactly one $S_j$, $j = 1, 2, \dots, m - 1$, and possibly to $S_m$. Thus,
if $S_m$ is feasible, that is, no (additional) item is assigned by Procedure LBBP to
$S_m$, then $|\bigcup_{j=m+1}^{v} X_j| \le m - 1 < \sum_{j=1}^{m} \sum_{i \in S_j} w_i$. On the other hand, if an item is
assigned by Procedure LBBP to $S_m$, then none of the subsets $S_j$, $j = 1, 2, \dots, m$,
is feasible, and therefore, $m = |\bigcup_{j=m+1}^{v} X_j| < \sum_{j=1}^{m} \sum_{i \in S_j} w_i$. ∎

We are now ready to prove the first part of Theorem 4.2.2, that is, establish the
upper bound on the absolute performance ratio of the XF heuristic. Let $c$ be the
number of large items in the list $L$. Without loss of generality, assume $b^{XF}(L) > c$
since otherwise the solution produced by XF is optimal. So $b^{XF}(L) - c > 0$ is the
number of type I bins produced by XF. We consider the following two cases.

Case 1: $c$ is even. In this case, we partition the bins produced by XF into two sets.
The first set includes only type I bins, while the second set includes the remaining
bins produced by XF, that is, all the type II bins. Index the bins in the first set in
the order they are opened, from 1 to $b^{XF}(L) - c$. Let $v = b^{XF}(L) - c$, and apply
Procedure LBBP to the set of type I bins, producing $m$ bins out of which at least
$m - 1$ are infeasible. Then

Lemma 4.2.5 If $c$ is even,

$$\max\left\{ m + \frac{c}{2},\ 2(b^{XF}(L) - m) - \frac{3c}{2} \right\} \le b^*(L).$$

Proof. Combining Lemma 4.2.4 with the fact that no two large items fit in the
same bin, we have $\sum_{i \in L} w_i > m - 1 + \frac{c}{2}$. On the other hand, every bin in an optimal
solution is feasible, and therefore, $\sum_{i \in L} w_i \le b^*(L)$. Since $c$ is even, $m + \frac{c}{2} \le b^*(L)$.
Since we applied Procedure LBBP only to the type I bins produced by XF, each
one of these bins has at least two items except possibly one that may have only
one item. Hence, $2(b^{XF}(L) - m - c - 1) + 1 \le |\bigcup_{j=m+1}^{v} X_j|$, and therefore, using
Lemma 4.2.4,

$$2(b^{XF}(L) - m - c - 1) + \frac{c}{2} + 1 < \sum_{i \in L} w_i \le b^*(L)$$

or

$$2(b^{XF}(L) - m - c - 1) + \frac{c}{2} + 2 \le b^*(L).$$

Rearranging the left-hand side gives the second lower bound. ∎

Theorem 4.2.6 If $c$ is even,

$$b^{XF}(L) \le \frac{7}{4}\, b^*(L).$$

Proof. From Lemma 4.2.5, we have $2(b^{XF}(L) - m) - \frac{3c}{2} \le b^*(L)$. Hence,

$$b^{XF}(L) \le \frac{b^*(L)}{2} + m + \frac{3c}{4} = \frac{b^*(L)}{2} + \left(m + \frac{c}{2}\right) + \frac{c}{4} \le \frac{7}{4}\, b^*(L),$$

since $m + \frac{c}{2}$ and $c$ are both lower bounds on $b^*(L)$. ∎

Case 2: $c$ is odd. In this case, we partition the set of all bins generated by the XF
heuristic in a slightly different way. The first set of bins, called $B_1$, comprises all
the type I bins except the last type I bin opened by XF. The second set is made
up of the remaining bins; that is, these are all the type II bins together with the
type I bin not included in $B_1$. We now apply Procedure LBBP to the bins in $B_1$
[with $v = b^{XF}(L) - c - 1$], producing $m$ bins out of which at least $m - 1$ bins are
not feasible.

Lemma 4.2.7 If $c$ is odd,

$$\max\left\{ m + \frac{c + 1}{2},\ 2(b^{XF}(L) - m) - \frac{3c + 1}{2} \right\} \le b^*(L).$$

Proof. Take one of the type II bins and "match" it with the only type I bin not in
$B_1$; the total weight of these two bins is more than 1. Thus, using Lemma 4.2.4,
we have $\frac{c - 1}{2} + 1 + (m - 1) < \sum_{i \in L} w_i \le b^*(L)$, which proves the first lower bound.
To prove the second lower bound, we use the fact that every bin in $B_1$ has at least
two items, and therefore, $2(b^{XF}(L) - m - c - 1) \le |\bigcup_{j=m+1}^{v} X_j|$. Using Lemma 4.2.4,
we get

$$2(b^{XF}(L) - m - c - 1) + \frac{c - 1}{2} + 1 < \sum_{i \in L} w_i \le b^*(L)$$

or

$$2(b^{XF}(L) - m - c - 1) + \frac{c - 1}{2} + 2 \le b^*(L).$$

Rearranging the left-hand side gives the second lower bound. ∎

Theorem 4.2.8 If $c$ is odd,

$$b^{XF}(L) \le \frac{7}{4}\, b^*(L) - \frac{1}{4}.$$

Proof. From Lemma 4.2.7, we have $2(b^{XF}(L) - m) - \frac{3c + 1}{2} \le b^*(L)$. Hence,

$$b^{XF}(L) \le \frac{b^*(L)}{2} + m + \frac{3c + 1}{4} = \frac{b^*(L)}{2} + \left(m + \frac{c}{2} + \frac{1}{2}\right) + \frac{c}{4} - \frac{1}{4} \le \frac{7}{4}\, b^*(L) - \frac{1}{4}. \ ∎$$






4.2.2 First-Fit Decreasing and Best-Fit Decreasing

The proof of the worst-case bounds for FFD and BFD is based on Lemma 4.2.3.
This lemma states that if a bin produced by these heuristics contains only items
of size at most $\frac{1}{2}$, then the first two items assigned to the bin cannot fit in any bin
opened prior to it.

Let XFD denote either FFD or BFD. Index the bins produced by XFD in the
order they are opened. We consider three cases. First, suppose $b^{XFD}(L) = 3p$ for
some integer $p \ge 1$. Consider the bin with index $2p + 1$. If this bin contains a
large item, we are done since in that case $b^*(L) \ge 2p = \frac{2}{3} b^{XFD}(L)$. Otherwise, bins
$2p + 1$ through $3p$ must contain at least $2p - 1$ small items, none of which can fit
in the first $2p$ bins. Hence, the total sum of the item sizes exceeds $2p - 1$, meaning
that $b^*(L) \ge 2p = \frac{2}{3} b^{XFD}(L)$.

Suppose $b^{XFD}(L) = 3p + 1$. If bin $2p + 1$ contains a large item, we are done.
Otherwise, bins $2p + 1$ through $3p + 1$ contain at least $2p + 1$ small items, none
of which can fit in the first $2p$ bins, implying that the total sum of the item sizes
exceeds $2p$ and hence $b^*(L) \ge 2p + 1 > \frac{2}{3} b^{XFD}(L)$.

Similarly, suppose $b^{XFD}(L) = 3p + 2$. If bin $2p + 2$ contains a large item, we are
done. Otherwise, bins $2p + 2$ through $3p + 2$ contain at least $2p + 1$ small items,
none of which can fit in the first $2p + 1$ bins, implying the sum of the item sizes
exceeds $2p + 1$, and hence $b^*(L) \ge 2p + 2 > \frac{2}{3} b^{XFD}(L)$.
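
To make the object of this bound concrete, the following is a minimal sketch of first-fit decreasing, assuming the usual definition (sort the items in nonincreasing order of size, then place each item in the lowest-indexed bin with enough room). The function name `first_fit_decreasing`, the `capacity` parameter, and the sample data are illustrative assumptions, not taken from the text.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack items by FFD; returns a list of bins, each a list of item sizes."""
    bins, loads = [], []
    for w in sorted(sizes, reverse=True):       # largest items first
        for b in range(len(bins)):              # scan bins in the order they were opened
            if loads[b] + w <= capacity:
                bins[b].append(w)
                loads[b] += w
                break
        else:                                   # no open bin has room: open a new one
            bins.append([w])
            loads.append(w)
    return bins


# Integer sizes with capacity 10 avoid floating-point round-off in the fit test.
items = [6, 6, 5, 4, 4, 3, 2]
packing = first_fit_decreasing(items, capacity=10)
lower_bound = -(-sum(items) // 10)              # ceil(total size / capacity) <= b*(L)
print(len(packing), lower_bound)
```

When the number of bins equals the simple lower bound, as happens for this sample list, the packing is optimal; in general the argument above only guarantees that the number of bins is at most 3/2 times the optimum.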



4.3 The Traveling Salesman Problem 



Interesting worst-case results have been obtained for another combinatorial prob- 
lem that plays an important role in the analysis of logistics systems: the traveling 
salesman problem (TSP). The problem can be defined as follows: Let $G = (V, E)$
be a complete undirected graph with vertices V, $|V| = n$, and edges E, and let
$d_{ij}$ be the length of edge $(i, j)$. [We use the term length to designate the "cost" of
using edge $(i, j)$. The most general formulation of the TSP allows for completely
arbitrary "lengths," and, in fact, in many applications the physical distance is
irrelevant and the $d_{ij}$ simply represents the cost of sequencing j immediately after
i.] The objective in the TSP is to find a tour that visits each vertex exactly once
and whose total length is as small as possible. The problem has been analyzed 
extensively in the last three decades; see Lawler et al. (1985) for an excellent sur- 
vey and, in particular, the chapter written by Johnson and Papadimitriou (1985), 
which includes some of the worst-case results presented here. 

We shall examine a variety of heuristics for the TSP and show that, for an 
important special case of this problem, heuristics with strong worst-case bounds 
exist. We start, however, with a negative result, due to Sahni and Gonzalez (1976),
which states that, in general, finding a heuristic for the TSP with a constant
worst-case bound is as hard as solving any $\mathcal{NP}$-Complete problem, no matter what the
bound.

To present the result, let I be an instance of the TSP. Let $L^*(I)$ be the length
of the optimal traveling salesman tour through V. Given a heuristic H, let $L^H(I)$
be the length of the tour generated by H.



Theorem 4.3.1 Suppose there exist a polynomial-time heuristic H for the TSP
and a constant $R^H$ such that for all instances I,

$$\frac{L^H(I)}{L^*(I)} \le R^H;$$

then $\mathcal{P} = \mathcal{NP}$.



Proof. The proof is in the same spirit as the proof of Lemma 4.2.1. Suppose such
a heuristic exists. We will use it to solve the $\mathcal{NP}$-Complete Hamiltonian cycle
problem in polynomial time. The Hamiltonian cycle problem is defined as follows.
Given a graph $G = (V, E)$, does there exist a simple cycle (a cycle that does not
visit a point more than once) in G that includes all of V? To answer this question,
we construct an instance I of the TSP and apply H to it; the length of the tour
generated by H will tell us whether G has a Hamiltonian cycle.

The instance I is defined on a complete graph whose set of vertices is V and
the length of each edge $\{i, j\}$ is

$$d_{ij} = \begin{cases} 1, & \text{if } \{i, j\} \in E; \\ |V| R^H, & \text{otherwise.} \end{cases}$$

We distinguish between two cases depending on whether G contains a Hamiltonian
cycle. If G does not contain a Hamiltonian cycle, then any traveling salesman
tour in I must contain at least one edge with length $|V| R^H$, and hence the length
of the tour generated by H is at least $|V| R^H + |V| - 1$.

On the other hand, if G has a Hamiltonian cycle, then I must have a tour of
length $|V|$. This is true since we can use the Hamiltonian cycle as a traveling
salesman tour for the instance I in which the vertices appear on the traveling
salesman tour in the same order they appear in the Hamiltonian cycle. Thus, if G
has a Hamiltonian cycle, heuristic H applied to I must provide a tour of length no
more than $|V| R^H$.

Consequently, we have a method for solving the Hamiltonian cycle problem:
Apply H to the TSP defined on the instance I. If $L^H(I) \le |V| R^H$, then there
exists a Hamiltonian cycle in G. Otherwise, there is no such cycle in G. Finally,
since H is assumed to be polynomial, we conclude that $\mathcal{P} = \mathcal{NP}$. ∎
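
The construction used in this proof can be sketched in a few lines. This is only an illustration under stated assumptions: vertices are labeled 0, ..., n-1, the function names are hypothetical, and `exact_tour` is a brute-force stand-in for the heuristic H. It satisfies any ratio of at least 1, but it runs in exponential time, so it does not contradict the theorem.

```python
from itertools import permutations


def tsp_instance_from_graph(n, edges, ratio):
    """Distance matrix of the proof: 1 on edges of G, n * ratio on all other pairs."""
    edge_set = {frozenset(e) for e in edges}
    big = n * ratio
    return [[0 if i == j else (1 if frozenset((i, j)) in edge_set else big)
             for j in range(n)] for i in range(n)]


def tour_length(d, tour):
    """Length of the closed tour that visits the vertices in the given order."""
    return sum(d[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))


def exact_tour(d):
    """Brute-force optimal tour (stand-in for H; exponential time)."""
    n = len(d)
    best = min(permutations(range(1, n)), key=lambda p: tour_length(d, (0,) + p))
    return (0,) + best


def has_hamiltonian_cycle(n, edges, heuristic, ratio):
    """Decide Hamiltonicity of G from the length of the tour returned by the heuristic."""
    d = tsp_instance_from_graph(n, edges, ratio)
    return tour_length(d, heuristic(d)) <= n * ratio


square = [(0, 1), (1, 2), (2, 3), (3, 0)]   # a 4-cycle: Hamiltonian
star = [(0, 1), (0, 2), (0, 3)]             # a star: not Hamiltonian
print(has_hamiltonian_cycle(4, square, exact_tour, ratio=2))
print(has_hamiltonian_cycle(4, star, exact_tour, ratio=2))
```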

The theorem thus implies that it is very unlikely that a polynomial-time heuris- 
tic for the TSP with a constant absolute worst-case bound exists. However, there 
is an important version of the traveling salesman problem that excludes the above 
negative result. This is when the distance matrix $\{d_{ij}\}$ satisfies the triangle
inequality assumption.

Definition 4.3.2 A distance matrix satisfies the triangle inequality assumption if
for all $i, j, k \in V$, we have $d_{ij} \le d_{ik} + d_{kj}$.

In many logistics environments, the triangle inequality assumption is not a very
restrictive one. It merely states that traveling directly from point (vertex) i to
point (vertex) j costs no more than traveling from i to j through the point k.
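
As a small illustration, the assumption can be checked directly on a distance matrix. The function name and the tolerance parameter `tol` below are assumptions made for this sketch.

```python
def satisfies_triangle_inequality(d, tol=1e-12):
    """Return True if d[i][j] <= d[i][k] + d[k][j] for every triple (i, j, k)."""
    n = len(d)
    return all(d[i][j] <= d[i][k] + d[k][j] + tol
               for i in range(n) for j in range(n) for k in range(n))


# Three points on a line at positions 0, 1, 2: the assumption holds.
on_a_line = [[0, 1, 2],
             [1, 0, 1],
             [2, 1, 0]]
# Going from 0 to 2 directly costs 5, but only 2 through vertex 1: it fails.
detour_is_cheaper = [[0, 1, 5],
                     [1, 0, 1],
                     [5, 1, 0]]
print(satisfies_triangle_inequality(on_a_line))
print(satisfies_triangle_inequality(detour_is_cheaper))
```

The check inspects every triple, so it takes time proportional to the cube of the number of vertices.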

In the next four sections, we describe and analyze different heuristics developed
for the TSP. To simplify the presentation in what follows, we write $L^*$ instead of
$L^*(I)$; this should cause no confusion.

4.3.1 A Minimum Spanning Tree-Based Heuristic

The following algorithm provides a simple example of how a fixed worst-case bound
is possible for the TSP when the distance matrix satisfies the triangle inequality
assumption. In this case, the bound is 2; that is, the heuristic provides a solution
with total length at most 100% above the length of an optimal tour.

A spanning tree of a graph $G = (V, E)$ is a connected subgraph with $|V| - 1$
edges spanning all of V. The cost (or weight) of a tree is the sum of the length of
the edges in the tree. A minimum spanning tree (MST) is a spanning tree with
minimum cost. It is well known and easy to show that a minimum spanning tree
can be found in polynomial time [see, for example, Papadimitriou and Stieglitz
(1982)]. If $W^*$ denotes the weight (cost) of the minimum spanning tree, then we
must have $W^* \le L^*$ since deleting any edge from the optimal tour results in a
spanning tree.

The minimum spanning tree can be used to find a feasible traveling salesman 
tour in polynomial time. The idea is to perform a depth-first search [see Aho et al. 
(1974)] over the minimum spanning tree and then to do simple improvements on 
this solution. Formally, this is done as follows (Johnson and Papadimitriou 1985). 

A Minimum Spanning Tree-Based Heuristic 

Step 1: Construct a minimum spanning tree and color its edges white, and all 
other edges black. 

Step 2: Let the current vertex (denoted v) be an arbitrary vertex. 

Step 3: If one of the edges adjacent to v in the MST is white, color it black and 
proceed to the vertex at the other end of this edge. Else (all edges from 
v are black), go back along the edge by which the current vertex was 
originally reached. 

Step 4: Let this vertex be v. Stop if v is the vertex you started with and all edges
of MST are black. Otherwise, go to step 3.



Observe that the above strategy produces a tour that starts and ends at one
of the vertices and visits all other vertices in the graph, traversing each edge of
the MST twice. This is not a very efficient tour since some vertices may be visited
more than once. To improve on this tour, we can modify the above strategy as
follows: Instead of going back to a visited vertex, we can use a shortcut strategy
in which we skip this vertex, and go directly to the next unvisited vertex. The
triangle inequality assumption implies that the above modification will not
increase the length of the tour and, in fact, may reduce it.
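
The heuristic with shortcuts can be sketched as follows. This is a minimal illustration, assuming a symmetric distance matrix `d` given as a list of lists and Prim's algorithm for the spanning tree; the function names and the sample points are hypothetical. Visiting the tree vertices in depth-first preorder gives exactly the tour obtained by walking around the tree and skipping vertices already visited.

```python
import math


def minimum_spanning_tree(d):
    """Prim's algorithm; returns (children lists of the tree rooted at 0, tree weight)."""
    n = len(d)
    in_tree = [False] * n
    parent = [None] * n
    key = [math.inf] * n
    key[0] = 0.0
    children = {v: [] for v in range(n)}
    weight = 0.0
    for _ in range(n):
        # pick the vertex outside the tree that is cheapest to attach
        v = min((u for u in range(n) if not in_tree[u]), key=lambda u: key[u])
        in_tree[v] = True
        weight += key[v]
        if parent[v] is not None:
            children[parent[v]].append(v)
        for u in range(n):
            if not in_tree[u] and d[v][u] < key[u]:
                key[u], parent[u] = d[v][u], v
    return children, weight


def mst_tour(d):
    """Depth-first preorder of the MST: the doubled-tree walk with shortcuts."""
    children, weight = minimum_spanning_tree(d)
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))   # visit children in insertion order
    return tour, weight


def tour_length(d, tour):
    return sum(d[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))


points = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 0)]
d = [[math.dist(p, q) for q in points] for p in points]
tour, w_star = mst_tour(d)
print(tour, tour_length(d, tour), 2 * w_star)
```

On Euclidean data such as the sample points, the printed tour length never exceeds twice the tree weight, in line with the bound derived next.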

Let $L^{MST}$ be the length of the traveling salesman tour generated by the above
strategy. We clearly have

$$L^{MST} \le 2W^* \le 2L^*,$$

where the first inequality follows since without shortcuts, the length of the tour
is exactly $2W^*$. This proves that the worst-case bound of the algorithm is at
most 2. It remains to verify that the worst-case bound of this heuristic cannot be
improved. For this purpose, consider Fig. 4.1, the example constructed by Johnson
and Papadimitriou (1985). Here, $W^* = \frac{n}{3} + \frac{n}{3}(1 - \epsilon) + 2\epsilon - 1$, $L^{MST} \approx \frac{2n}{3} + \frac{2n}{3}(1 - \epsilon)$,
and $L^* = \frac{2n}{3}$.

4.3.2 The Nearest-Insertion Heuristic

Before describing this heuristic, we consider the following intuitively appealing
strategy, called the nearest-neighbor heuristic. Given an instance I of the TSP,
start with an arbitrary vertex and find the vertex not yet visited that is closest to
the current vertex. Travel to this vertex. Repeat this until all vertices are visited;
then go back to the starting vertex.

Unfortunately, Rosenkrantz et al. (1977) show the existence of a family of
instances for the TSP with arbitrary n with the following property. The length of the
tour generated by the nearest-neighbor heuristic on each instance in the family is
$O(\log n)$ times the length of the optimal tour. Thus, the nearest-neighbor heuristic
does not have a bounded worst-case performance.

This comes as no surprise since the algorithm obviously suffers from one major 
weakness. This “greedy” strategy tends to begin well, inserting very short arcs 
into the path, but ultimately it ends with arcs that are quite long. For instance, 
the last edge added, the one connecting the last node to the starting node, may be 
very long due to the fact that at no point does the heuristic consider the location 
of the starting vertex and possible ending vertices. 
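
The weakness is easy to see on a tiny example. The sketch below assumes a symmetric distance matrix `d`; the function names, the tie-breaking rule (lowest index), and the four points on a line are illustrative choices, not taken from the text.

```python
def nearest_neighbor_tour(d, start=0):
    """Greedy tour: repeatedly travel to the closest vertex not yet visited."""
    tour = [start]
    unvisited = set(range(len(d))) - {start}
    while unvisited:
        current = tour[-1]
        # ties are broken by the lowest vertex index
        nxt = min(unvisited, key=lambda v: (d[current][v], v))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def tour_length(d, tour):
    return sum(d[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))


# Four cities on a line; the salesman starts at position 0.
positions = [0, 2, -3, 8]
d = [[abs(p - q) for q in positions] for p in positions]
tour = nearest_neighbor_tour(d)
print(tour, tour_length(d, tour))
```

Here the greedy tour has length 26, whereas sweeping the line once in each direction costs only 22: the early short moves force a long jump across the line and a long return edge.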

FIGURE 4.1. An example for the minimum spanning tree-based algorithm with n = 18
(panels: the minimum spanning tree; the tour generated by the minimum spanning
tree-based algorithm)

One way to improve the performance of the nearest-neighbor heuristic is
presented in the following variant, called the nearest-insertion (NI) heuristic,
developed and analyzed by Rosenkrantz et al. Informally, the heuristic works as follows:
At each iteration of the heuristic, a Hamiltonian cycle containing a subset of the
vertices is constructed. The heuristic then selects a new vertex not yet in the cycle
that is "closest" in a specific sense and inserts it between two adjacent vertices in
the cycle. The process stops when all vertices are in the cycle. Formally, this is
done as follows.

The Nearest-Insertion Heuristic

Step 1: Choose an arbitrary node v and let the cycle C consist of only v.

Step 2: Find a node outside C closest to a node in C; call it k.

Step 3: Find an edge $\{i, j\}$ in C such that $d_{ik} + d_{kj} - d_{ij}$ is minimal.

Step 4: Construct a new cycle C by replacing $\{i, j\}$ with $\{i, k\}$ and $\{k, j\}$.

Step 5: If the current cycle C contains all the vertices, stop. Otherwise, go to 
step 2. 
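
The five steps above can be sketched as follows, assuming a symmetric distance matrix `d` and breaking ties by lowest index; both the function names and the sample points are hypothetical. When the cycle holds a single vertex there is no edge to replace, so the new vertex is simply appended.

```python
import math


def nearest_insertion_tour(d, start=0):
    """Nearest-insertion heuristic; returns the cycle as a list of vertices."""
    cycle = [start]                                   # Step 1
    outside = set(range(len(d))) - {start}
    while outside:                                    # Step 5: repeat until all are in
        # Step 2: the outside vertex closest to some vertex of the cycle
        k = min(outside, key=lambda v: (min(d[v][c] for c in cycle), v))
        if len(cycle) == 1:
            pos = 1
        else:
            # Step 3: the cycle edge {i, j} minimizing d_ik + d_kj - d_ij
            def extra(p):
                i, j = cycle[p], cycle[(p + 1) % len(cycle)]
                return d[i][k] + d[k][j] - d[i][j]
            pos = min(range(len(cycle)), key=extra) + 1
        cycle.insert(pos, k)                          # Step 4: replace {i,j} by {i,k},{k,j}
        outside.remove(k)
    return cycle


def tour_length(d, tour):
    return sum(d[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))


points = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 0)]
d = [[math.dist(p, q) for q in points] for p in points]
tour = nearest_insertion_tour(d)
print(tour, tour_length(d, tour))
```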

Let $L^{NI}$ be the length of the solution obtained by the nearest-insertion heuristic.
Then

Theorem 4.3.3 For all instances of the TSP satisfying the triangle inequality,

$$L^{NI} \le 2L^*.$$

We start by proving the following interesting result. Let T be a spanning tree
of G and let $W(T)$ be the weight (cost) of that tree; that is, $W(T)$ is the sum of
the length of all edges in the tree T. Then

Lemma 4.3.4 For every spanning tree T,

$$L^{NI} \le 2W(T).$$

Proof. We prove the lemma by matching each vertex we insert during the execution 
of the algorithm with a single edge of the given tree T. To do that, we describe a 
procedure that will be carried out in parallel to the nearest-insertion heuristic. 

The Dual Nearest-Insertion Procedure

Step 1: Start with a family $\mathcal{T}$ of trees that, at first, consists of only the tree T.

Step 2: Given k (the vertex selected in step 2 of NI), find the unique tree in $\mathcal{T}$
containing k. Let this tree be $T_k$.

Step 3: Let $\ell$ be the unique vertex in $T_k$ that is in the current cycle.

Step 4: Let h be the vertex adjacent to $\ell$ on the unique path from $\ell$ to k. Replace
$T_k$ in $\mathcal{T}$ with the two trees obtained from $T_k$ by deleting edge $\{\ell, h\}$.

Step 5: If $\mathcal{T}$ contains n trees, stop. Otherwise, go to step 2.

The dual nearest-insertion procedure is carried out in parallel to the
nearest-insertion heuristic in the sense that each time step 1 is performed in the latter
procedure, step 1 is performed in the former procedure. Each time step 2 is
performed in the latter, step 2 is performed in the former, etc.

Observe that each time step 4 of the dual nearest-insertion procedure is
performed, the set of trees $\mathcal{T}$ is updated so that each tree in $\mathcal{T}$ has exactly one vertex
from the current cycle and each vertex of the current cycle belongs to exactly one
tree. This is true since when edge $\{\ell, h\}$ is deleted, two subtrees are constructed,
one containing the vertex $\ell$ and the other containing the vertex k. Edge $\{\ell, h\}$ is
the one we associate with the insertion of vertex k.

Let m be the vertex in the current cycle to which vertex k (not in the cycle)
was closest. That is, m is the vertex such that $d_{km}$ is the smallest among all $d_{uv}$,
where u is in the cycle and v is outside the cycle. Let $m + 1$ be one of the vertices
on the cycle adjacent to m. Finally, let edge $\{i, j\}$ be the edge deleted from the
current cycle. Clearly, inserting k into the current cycle increases the length of the
tour by

$$d_{ik} + d_{kj} - d_{ij} \le d_{mk} + d_{k,m+1} - d_{m,m+1} \le 2d_{mk},$$

where the left-hand inequality holds because of step 3 of the nearest-insertion
heuristic and the right-hand inequality holds in view of the triangle inequality
assumption. Of course, this is true only when the cycle contains at least two
vertices. When it contains exactly one vertex, that is, when the nearest-insertion
algorithm enters step 2 for the first time, inserting k into the current cycle increases
the length of the tour by exactly $2d_{mk}$.

Since $\ell$ is in the current cycle and h is not, $d_{mk} \le d_{\ell h}$. Hence, the increase in
the cost of the current cycle is no more than $2d_{\ell h}$. Finally, since this relationship
holds for every edge of T and the corresponding inserted vertex, we have

$$L^{NI} \le 2W(T).$$



To finish the proof of Theorem 4.3.3, apply Lemma 4.3.4 with $T = T^*$, a minimum
spanning tree; thus,

$$W^* = W(T^*) \le L^* \le L^{NI} \le 2W(T^*) \le 2L^*.$$

This completes the proof of the theorem.

To see that the bound is tight, consider the example (constructed by Rosenkrantz
et al. 1977) depicted in Fig. 4.2. In this example, the length of every edge connecting
two consecutive vertices on the perimeter is 1 while all other edges have length
2. Thus, the optimal traveling salesman tour visits the vertices according to their
appearance on the circle, and therefore, $L^* = n$. It is easy to see that the
nearest-insertion heuristic generates the tour depicted in Fig. 4.2b with cost $L^{NI} = 2n - 2$.

FIGURE 4.2. An example for the nearest-insertion algorithm with n = 8
(panel b: the tour generated by the nearest-insertion algorithm)



4.3.3 Christofides' Heuristic

In 1976, Christofides presented a very simple algorithm that currently has the 
best-known worst-case performance bound for the TSP. To present the algorithm, 
we need to state several properties of graphs. 

Lemma 4.3.5 Given a graph with at least two vertices, the number of vertices 
with odd degree is even. 

Definition 4.3.6 An Eulerian tour is a tour that traverses all edges of a graph 
exactly once. 

Definition 4.3.7 An Eulerian graph is a graph that has an Eulerian tour. 

Then it is a simple exercise to show the following. 

Lemma 4.3.8 A connected graph is Eulerian if and only if the degree of each 
vertex is even. 

Christofides’ algorithm starts with a minimum spanning tree. Of course, this 
tree (as any other tree) is not Eulerian, since some of the vertices have odd degree. 
We can augment the graph (by adding suitably chosen arcs) so that it becomes 
Eulerian. In fact, we would like to add a number of arcs connecting odd-degree 
vertices so that they then have even degree. To do this, we will find a minimum 
weight matching among the odd-degree vertices. 

Given a graph with an even number of vertices, a matching is a subset of edges
with the property that every vertex is the endpoint of exactly one edge of the
subset. In the minimum-weight matching problem, the objective is to find a matching
whose total length of all its edges is minimum. This problem can be solved in
$O(n^3)$, where n is the number of vertices in the graph [see Lawler (1976)].

Lemma 4.3.5 tells us that the number of vertices with odd degree in the MST is
even. Thus, adding the edges of a matching defined on those odd-degree vertices
clearly increases the degree of each of these vertices by one. The resulting graph
is Eulerian, by Lemma 4.3.8. Of course, to minimize the total cost, we would
like to select the edges of a minimum-weight matching. Finally, the Eulerian tour
generated is transformed into a traveling salesman tour using shortcuts, similarly
to what was done in the minimum spanning tree-based heuristic of Sect. 4.3.1.
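
The whole pipeline (spanning tree, matching on the odd-degree vertices, Eulerian tour, shortcuts) can be sketched for small instances as follows. Two simplifications are assumptions of this sketch rather than part of the heuristic as described: the minimum-weight matching is found by exhaustive search instead of a polynomial-time matching algorithm, and the Eulerian tour is built with Hierholzer's algorithm. All function names and the sample points are hypothetical.

```python
import math


def mst_edges(d):
    """Prim's algorithm; returns the edges (u, v) of a minimum spanning tree."""
    n, in_tree, edges = len(d), {0}, []
    while len(in_tree) < n:
        u, v = min(((a, b) for a in sorted(in_tree) for b in range(n) if b not in in_tree),
                   key=lambda e: d[e[0]][e[1]])
        edges.append((u, v))
        in_tree.add(v)
    return edges


def min_weight_matching(d, nodes):
    """Exhaustive minimum-weight perfect matching; suitable only for few nodes."""
    if not nodes:
        return 0.0, []
    first, rest = nodes[0], nodes[1:]
    best = (math.inf, [])
    for idx, partner in enumerate(rest):
        w, m = min_weight_matching(d, rest[:idx] + rest[idx + 1:])
        if d[first][partner] + w < best[0]:
            best = (d[first][partner] + w, [(first, partner)] + m)
    return best


def euler_tour(n, edges, start=0):
    """Hierholzer's algorithm on a connected multigraph with all degrees even."""
    adj = {v: [] for v in range(n)}
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    used = [False] * len(edges)
    stack, walk = [start], []
    while stack:
        v = stack[-1]
        while adj[v] and used[adj[v][-1][1]]:     # discard edges already traversed
            adj[v].pop()
        if adj[v]:
            u, idx = adj[v].pop()
            used[idx] = True
            stack.append(u)
        else:
            walk.append(stack.pop())
    return walk


def christofides_tour(d):
    n = len(d)
    tree = mst_edges(d)
    degree = [0] * n
    for u, v in tree:
        degree[u] += 1
        degree[v] += 1
    odd = [v for v in range(n) if degree[v] % 2 == 1]
    _, matching = min_weight_matching(d, odd)
    walk = euler_tour(n, tree + matching)
    seen, tour = set(), []
    for v in walk:                                # shortcut: skip repeated vertices
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour


def tour_length(d, tour):
    return sum(d[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))


points = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 0)]
d = [[math.dist(p, q) for q in points] for p in points]
tour = christofides_tour(d)
print(tour, tour_length(d, tour))
```

The exhaustive matching step is what limits this sketch to small inputs; replacing it with a polynomial-time matching routine is what makes the heuristic practical.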

Let $L^C$ be the length of the tour generated by Christofides' heuristic. We prove
the following:

Theorem 4.3.9 For all instances of the TSP satisfying the triangle inequality,
we have

$$L^C \le \frac{3}{2} L^*.$$

Proof. Recall that $W^* = W(T^*)$ is the cost of the MST and let $W(M^*)$ be the
weight of the minimum-weight matching, that is, the sum of edge lengths of all
edges in the optimal matching. Because of the triangle inequality assumption,

$$L^C \le W(T^*) + W(M^*).$$

We already know that $W(T^*) \le L^*$. It remains to show that $W(M^*) \le \frac{1}{2} L^*$.
For this purpose, index the vertices of odd degree in the minimum spanning
tree $i_1, i_2, \ldots, i_{2k}$ according to their appearance on an optimal traveling
salesman tour. Consider two feasible solutions for the minimum-weight matching
problem defined on these vertices. The first matching, denoted $M^1$, consists of edges
$\{i_1, i_2\}, \{i_3, i_4\}, \ldots, \{i_{2k-1}, i_{2k}\}$. The second matching, denoted $M^2$, consists of
edges $\{i_2, i_3\}, \{i_4, i_5\}, \ldots, \{i_{2k}, i_1\}$.

We clearly have $W(M^*) \le \frac{1}{2}[W(M^1) + W(M^2)]$. The triangle inequality
assumption tells us that $W(M^1) + W(M^2) \le L^*$; see Fig. 4.3. Hence, $W(M^*) \le \frac{1}{2} L^*$
and, consequently,

$$L^C \le W(T^*) + W(M^*) \le \frac{3}{2} L^*.$$

∎

As in the two previous heuristics, this bound is tight. Consider the example
depicted in Fig. 4.4 for which $L^* = n$ while $L^C = n - 1 + \frac{n-1}{2}$.

FIGURE 4.3. The matching and the optimal traveling salesman tour

FIGURE 4.4. An example for Christofides' algorithm with n = 7



4.3.4 Local Search Heuristics 

Some of the oldest and, by far, the most extensively used heuristics developed for
the traveling salesman problem are the so-called k-opt procedures ($k \ge 2$). These
heuristics, part of the extensive class of local search procedures, can be described
as follows. Given a traveling salesman tour through the set of vertices V, say the
sequence

$$\{i_1, i_2, \ldots, i_{u_1}, i_{u_2}, \ldots, i_{v_1}, i_{v_2}, \ldots, i_n\},$$

an $\ell$-exchange is a procedure that replaces $\ell$ edges currently in the tour by $\ell$
new edges so that the result is again a traveling salesman tour. For instance, a
2-exchange procedure replaces edges $\{i_{u_1}, i_{u_2}\}$ and $\{i_{v_1}, i_{v_2}\}$ with $\{i_{u_1}, i_{v_1}\}$ and
$\{i_{u_2}, i_{v_2}\}$ and results in a new tour:

$$\{i_1, i_2, \ldots, i_{u_1}, i_{v_1}, i_{v_1 - 1}, \ldots, i_{u_2}, i_{v_2}, i_{v_2 + 1}, \ldots, i_n\}.$$

An improving $\ell$-exchange is an $\ell$-exchange that results in a tour whose total length
(cost) is smaller than the cost of the original tour.

A k-opt procedure starts from an arbitrary traveling salesman tour and,
using improving $\ell$-exchanges, for $\ell \le k$, successively generates tours of smaller and
smaller length. The procedure terminates when no improving $\ell$-exchange is found
for all $\ell \le k$. Let $L^{OPT(k)}$ be the length of the tour generated by a k-opt heuristic,
for $k \ge 2$.
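
For k = 2, the procedure can be sketched as follows. The sketch assumes a symmetric distance matrix `d` and applies each improving 2-exchange as soon as it is found; the function names, the tolerance `1e-12`, and the sample tour are illustrative assumptions. Reversing the segment between the two removed edges is exactly the reordering shown in the new tour above.

```python
import math


def two_opt(d, tour):
    """Apply improving 2-exchanges until none remains; returns a 2-optimal tour."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for a in range(n - 1):
            for b in range(a + 2, n):
                if a == 0 and b == n - 1:
                    continue                      # these two edges share a vertex
                i, j = tour[a], tour[a + 1]       # first removed edge {i, j}
                k, l = tour[b], tour[(b + 1) % n]  # second removed edge {k, l}
                delta = d[i][k] + d[j][l] - d[i][j] - d[k][l]
                if delta < -1e-12:                # an improving 2-exchange
                    tour[a + 1:b + 1] = reversed(tour[a + 1:b + 1])
                    improved = True
    return tour


def tour_length(d, tour):
    return sum(d[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))


# Corners of the unit square; the starting tour crosses itself along both diagonals.
points = [(0, 0), (1, 0), (1, 1), (0, 1)]
d = [[math.dist(p, q) for q in points] for p in points]
start = [0, 2, 1, 3]
result = two_opt(d, start)
print(tour_length(d, start), result, tour_length(d, result))
```

A single exchange removes both diagonals here and yields the perimeter tour, which no further 2-exchange can improve.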

Recently, Chandra et al. (1999) obtained interesting results on the worst-case
performance of the k-opt heuristic. They show

Theorem 4.3.10 For all instances of the TSP satisfying the triangle inequality,
we have

$$\frac{L^{OPT(2)}}{L^*} \le 4\sqrt{n}.$$

In addition, there exists an infinitely large family of TSP instances satisfying the
triangle inequality assumption for which

$$\frac{L^{OPT(2)}}{L^*} \ge \frac{1}{4}\sqrt{n}.$$

They also provide a lower bound on the worst-case performance of k-opt for all
$k \ge 3$.

Theorem 4.3.11 There exists an infinitely large family of TSP instances
satisfying the triangle inequality assumption with

$$\frac{L^{OPT(k)}}{L^*} \ge \frac{1}{4} n^{\frac{1}{2k}}$$

for any $k \ge 2$.

Thus, the above results indicate that the worst-case performances of k-opt
heuristics are quite poor. By contrast, many researchers and practitioners have
reported that k-opt heuristics can be highly effective; see, for instance, Golden
and Stewart (1985).

This raises a fundamental dilemma. Although worst-case analysis provides a 
rigid guarantee on a heuristic’s performance, it suffers from being highly deter- 
mined by certain pathological examples. Is there a more appropriate measure to 
assess the effectiveness of a particular heuristic, one that would assess the effec- 
tiveness on an average or realistic example? We will try to address this question 
in the next chapter.

Exercise 4.1. Prove Lemma 4.3.8.

Exercise 4.2. The 2-TSP is the problem of designing two tours that together 
visit each of the customers and use the same starting point. Show that any algo- 
rithm for the TSP can solve this problem as well. 

Exercise 4.3. (Papadimitriou and Stieglitz 1982) Consider the n-city TSP in
which the triangle inequality assumption holds. Let $c^* > 0$ be the length of an
optimal tour, and let $c'$ be the length of the second-best tour. Prove
$(c' - c^*)/c^* \le \frac{2}{n}$.

Exercise 4.4. Prove that in every completely connected directed graph (a graph 
in which between every pair of vertices there is a directed edge in one of the two 
possible directions), there is a directed Hamiltonian path. 

Exercise 4.5. Let $Z^C$ be the length of the tour provided by Christofides'
heuristic, and let $Z^*$ be the length of the optimal tour. Construct an example with
$Z^C = \frac{3}{2} Z^*$.

Exercise 4.6. Prove that for any graph G, there exists an even number of nodes 
with odd degree. 






Exercise 4.7. Let G be a tree with $n \ge 2$ nodes. Show that

(a) There exist at least two nodes with degree 1.

(b) The number of arcs is $n - 1$.

Exercise 4.8. Consider the n-city TSP defined with distances $d_{ij}$. Assume that
there exist $a, b \in \mathbb{R}^n$ such that for each i and j, $d_{ij} = a_i + b_j$. What is the length
of the optimal traveling salesman tour? Explain your solution.

Exercise 4.9. Consider the TSP with the triangle inequality assumption and
two prespecified nodes s and t. Assume that the traveling salesman tour has to
include edge (s, t) (that is, the salesman has to travel from s directly to t). Modify
Christofides' heuristic for this model and show that the worst-case bound is $\frac{3}{2}$.

Exercise 4.10. Show that a minimum spanning tree T satisfies the following
property. When T is compared with any other spanning tree T', the kth-shortest
edge of T is no longer than the kth-shortest edge of T', for $k = 1, 2, \ldots, n - 1$.

Exercise 4.11. (Papadimitriou and Stieglitz 1982) The wandering salesman 
problem (WSP) is a traveling salesman problem except that the salesman can 
start wherever he or she wishes and does not have to return to the starting city 
after visiting all cities. 

(a) Describe a heuristic for the WSP with worst-case bound $\frac{3}{2}$.

(b) Show that the same bound can be obtained for the problem when one of the
endpoints of the path is specified in advance.

Exercise 4.12. (Papadimitriou and Stieglitz 1982) Which of the following
problems remain essentially unchanged (complexity-wise) when they are transformed
from minimization to maximization problems? Why?

(a) Traveling salesman problem

(b) Shortest path from s to t

(c) Minimum-weight matching

(d) Minimum spanning tree 



Exercise 4.13. Suppose there are n jobs that require processing on m machines. 
Each job must be processed by machine 1, then by machine 2, . . . , and finally by 
machine m. Each machine can work on at most one job at a time, and once it 
begins work on a job it must work on it until completion, without interruption. 
The amount of time machine j must process job i is denoted $p_{ij} \ge 0$ (for $i =
1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$). Further suppose that once the processing of a job
is completed on machine j, its processing must begin immediately on machine $j + 1$
(for $j \le m - 1$). This is a flow shop with no wait-in-process.

Show that the problem of sequencing the jobs so that the last job is completed
as early as possible can be formulated as an $(n + 1)$-city TSP. Specifically, show
how the $d_{ij}$ values for the TSP can be expressed in terms of the $p_{ij}$ values.

Exercise 4.14. Consider the bin-packing problem with items of size $w_i$, $i =
1, 2, \ldots, n$, such that $0 \le w_i \le 1$. The objective is to find the minimum number
of unit-size bins $b^*$ needed to pack all the items without violating the capacity
constraint.

(a) Show that $\sum_{i=1}^{n} w_i$ is a lower bound on $b^*$.

(b) Define a locally optimal solution to be one where no two bins can be feasibly
combined into one. Show that any locally optimal solution uses no more than
twice the minimum number of bins, that is, no more than $2b^*$ bins.

(c) The next-fit heuristic is the following. Start by packing the first item in bin
1. Then each subsequent item is packed in the last-opened bin if possible,
or else a new bin is opened and it is placed there. Show that the next-fit
heuristic produces a solution with at most $2b^*$ bins.

Exercise 4.15. (Anily et al. 1994) Consider the bin-packing problem and the
next-fit increasing heuristic. In this strategy, items are ordered in a nondecreasing
order according to their size. Start by packing the first item in bin 1. Then each
subsequent item is packed in the last-opened bin if possible, or else a new bin is
opened and it is placed there. Show that the number of bins produced by this
strategy is no more than $\frac{7}{4}$ times the optimal number of bins. For this purpose,
consider the following two steps.

(a) Consider the following procedure. First, order the items in nondecreasing
order of their size. When packing bin $i \ge 1$, follow the packing rule: If the
bin is currently feasible (i.e., the total load is no more than 1), then assign
the next item to this bin; otherwise, close this bin, open bin $i + 1$, and put this
item in bin $i + 1$. Show that the number of bins generated by this procedure
is a lower bound on the minimal number of bins needed.

(b) Relate this lower-bounding procedure to the number of bins produced by the
next-fit increasing heuristic.

Exercise 4.16. Given a network $G = (V, E)$, and edge length $l_e$ for every $e \in E$,
assume that edge $(u, v)$ has a variable length x. Find an expression for the length
of the shortest path from s to t as a function of x.

Exercise 4.17. A complete directed network $G = (V, A)$ is a directed graph such
that for every pair of vertices $u, v \in V$, there are arcs $u \to v$ and $v \to u$ in A with
nonnegative arc lengths $d(u, v)$ and $d(v, u)$, respectively. The network $G = (V, A)$
satisfies the triangle inequality if, for all $u, v, w \in V$, $d(u, v) + d(v, w) \ge d(u, w)$.

A directed cycle is a sequence of vertices $v_1 \to v_2 \to \cdots \to v_\ell \to v_1$ without
any repeated vertex other than the first and last ones. If the cycle contains all the
vertices in G, then it is said to be a directed Hamiltonian cycle. To keep notation
simple, let $d_{ij} = d(v_i, v_j)$.

A directed cycle containing exactly k vertices is called a k-cycle. The length of
a cycle is defined as the sum of arc lengths used in the cycle. A directed network
$G = (V, A)$ with $|V| \ge k$ is said to be k-symmetric if, for every k-cycle
$v_1 \to v_2 \to \cdots \to v_k \to v_1$ in G,

$$d_{12} + d_{23} + \cdots + d_{k-1,k} + d_{k1} = d_{1k} + d_{k,k-1} + \cdots + d_{32} + d_{21}.$$

In other words, a k-symmetric network is a directed network in which the length
of every k-cycle remains unchanged if its orientation is reversed.

(a) Show that the asymmetric traveling salesman problem on a $|V|$-symmetric
network (satisfying the triangle inequality) can be solved via solving a
corresponding symmetric traveling salesman problem. In particular, show that
any heuristic with a fixed worst-case bound for the symmetric traveling
salesman problem can be used for the asymmetric traveling salesman problem on
a $|V|$-symmetric network to obtain a result with the same worst-case bound.

(b) Prove that any 3-symmetric network is k-symmetric for $k = 4, 5, \ldots, |V|$.

Thus, part (a) can be used if we have a 3-symmetric network. Argue that a 
3-symmetric network can be identified in polynomial time. 



5 

Average-Case Analysis



5.1 Introduction 

Worst-case performance analysis is one method of characterizing the effectiveness 
of a heuristic. It provides a guarantee on the maximum relative difference between 
the solution generated by the heuristic and the optimal solution for any possible 
problem instance, even those that are not likely to appear in practice. Thus, a 
heuristic that works well in practice may have a weak worst-case performance, 
if, for example, it provides very bad solutions for one (or more) pathological
instance(s).

To overcome this important drawback, researchers have recently focused on 
probabilistic analysis of algorithms with the objective of characterizing the aver- 
age performance of a heuristic under specific assumptions on the distribution of 
the problem data. As pointed out, for example, by Coffman and Lueker (1991), 
probabilistic analysis is frequently quite difficult and even the analysis of simple 
heuristics can often present a substantial challenge. Therefore, usually the anal- 
ysis is asymptotic. That is, the average performance of a heuristic can only be 
quantified when the problem size is extremely large. 

As we demonstrate in Parts II and IV, an asymptotic probabilistic analysis is 
useful for several reasons: 
1. It can foster new insights into which algorithmic approaches will be
effective for solving large problems. That is, the analysis provides a framework
where one can analyze and compare the performance of heuristics on large 
problems. 

2. For problems with fast rates of convergence, the analysis can sometimes 
explain the observed empirical behavior of heuristics for more reasonable- 
size problems. 

3. The approximations derived from the analysis can be used in other models 
and may lead to a better understanding of the tradeoffs in more complex 
problems integrating vehicle routing with other issues important to the firm, 
such as inventory control. 

In this chapter, we present some of the basic tools used in the analysis of the 
average performance of heuristics. Again, we use the bin-packing problem and the 
traveling salesman problem as the “raw materials” on which to present them. 



5.2 The Bin-Packing Problem 

The bin-packing problem provides a very well-studied example for which to demon- 
strate the benefits of a probabilistic analysis. 

Without loss of generality, we scale the bin capacity $q$ so that it is 1. Consider
the item sizes $w_1, w_2, \ldots$ to be independently and identically distributed on
$(0, 1]$ according to some general distribution $\Phi$. In this section, we demonstrate two
elegant and powerful techniques that can be used in the analysis of $b_n^*$, the random
variable representing the optimal solution value on the items $w_1, w_2, \ldots, w_n$. The
first is the theory of subadditive processes and the second is the theory of martingale
inequalities.

Subadditive Processes 

Let $\{a_n\}$, $n \ge 1$, be a sequence of positive real numbers. We say that the
sequence is subadditive if for all $n$ and $m$, we have $a_n + a_m \ge a_{n+m}$. The following
important result was proved by Kingman (1976) and Steele (1990), whose proof
we follow.

Theorem 5.2.1 If the sequence $\{a_n\}$, $n \ge 1$, is subadditive, then there exists a
constant $\gamma$ such that

$$\lim_{n \to \infty} \frac{a_n}{n} = \gamma.$$

Proof. Let $\gamma = \liminf_{n \to \infty} a_n / n$. For a given $\epsilon > 0$, select $n$ such that
$a_n / n \le \gamma + \epsilon$. Since the sequence $\{a_n\}$ is subadditive, we have

$$a_{nm} \le a_n + a_{n(m-1)}.$$

Making repeated use of this inequality, we get $a_{nm} \le m a_n$, which implies

$$\frac{a_{nm}}{nm} \le \frac{a_n}{n} \le \gamma + \epsilon.$$

For any $k$, $0 \le k < n$, define $\ell = nm + k$. Using subadditivity again, we have

$$a_\ell = a_{nm+k} \le a_{nm} + a_k \le a_{nm} + k a_1 \le a_{nm} + n a_1,$$

where the second inequality is obtained by repeating the first one $k$ times. Thus,

$$\frac{a_\ell}{\ell} = \frac{a_{nm+k}}{nm+k} \le \frac{a_{nm} + n a_1}{nm+k} \le \frac{a_{nm}}{nm} + \frac{a_1}{m} \le \gamma + \epsilon + \frac{a_1}{m}.$$

Taking the limit with respect to $m$, we have

$$\limsup_{\ell \to \infty} \frac{a_\ell}{\ell} \le \gamma + \epsilon + \lim_{m \to \infty} \frac{a_1}{m} = \gamma + \epsilon.$$

Since $\epsilon$ was chosen arbitrarily, the upper limit of $a_\ell / \ell$ is at most $\gamma$, its lower
limit, so the limit exists and equals $\gamma$. The proof is therefore complete. ∎
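
Theorem 5.2.1 can be illustrated numerically. The sketch below is only an illustration under an assumed example sequence, $a_n = \gamma n + \sqrt{n}$ with $\gamma = 0.5$, which is subadditive because $\sqrt{n+m} \le \sqrt{n} + \sqrt{m}$; the helper names `a` and `is_subadditive` are hypothetical.

```python
import math


def a(n, gamma=0.5):
    """Example subadditive sequence a_n = gamma * n + sqrt(n)."""
    return gamma * n + math.sqrt(n)


def is_subadditive(seq, n_max, tol=1e-12):
    """Check a_n + a_m >= a_{n+m} for all n, m >= 1 with n + m <= n_max."""
    for n in range(1, n_max):
        for m in range(n, n_max - n + 1):
            if seq(n) + seq(m) < seq(n + m) - tol:
                return False
    return True


if __name__ == "__main__":
    print(is_subadditive(a, 200))
    for n in (10, 1_000, 100_000, 10_000_000):
        # The ratio a_n / n decreases toward gamma = 0.5 as n grows.
        print(n, a(n) / n)
```

The ratio $a_n / n = \gamma + 1/\sqrt{n}$ approaches $\gamma$ slowly here, which is a reminder that the theorem says nothing about the rate of convergence.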

It is clear that the optimal solution of the bin-packing problem possesses a
subadditivity-like property; that is, for any sets $S, T \subseteq N$:

$$b^*(S \cup T) \le b^*(S) + b^*(T),$$

where $b^*(S)$ denotes the optimal solution to the bin-packing problem on a set
$S \subseteq N$. Using similar arguments as in the above analysis shows that there exists a
constant $\gamma$ such that the optimal solution to the bin-packing problem $b_n^*$ satisfies

$$\lim_{n \to \infty} \frac{b_n^*}{n} = \gamma \quad (a.s.).$$

In addition, $\gamma$ is dependent only on the item size distribution $\Phi$.



The Uniform Model 



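To illustrate the concepts just developed, consider the case where $\Phi$ is the
uniform distribution on $[0, 1]$. In order to pack a set of $n$ items drawn randomly from
this distribution, we use the following sliced interval partitioning heuristic with
parameter $r$ (SIP($r$)). It works as follows. For any fixed positive integer $r \ge 1$, the
set of items $N$ is partitioned into the following $2r$ disjoint subsets, some of which
may be empty:

$$N_j = \left\{ k \in N : \tfrac{1}{2}\left(1 - \tfrac{j+1}{r}\right) < w_k \le \tfrac{1}{2}\left(1 - \tfrac{j}{r}\right) \right\}, \quad j = 1, 2, \ldots, r-1,$$

and

$$N^j = \left\{ k \in N : \tfrac{1}{2}\left(1 + \tfrac{j-1}{r}\right) < w_k \le \tfrac{1}{2}\left(1 + \tfrac{j}{r}\right) \right\}, \quad j = 1, 2, \ldots, r-1.$$

Also,

$$N_0 = \left\{ k \in N : \tfrac{1}{2}\left(1 - \tfrac{1}{r}\right) < w_k \le \tfrac{1}{2} \right\}$$

and

$$N_r = \left\{ k \in N : \tfrac{1}{2}\left(1 + \tfrac{r-1}{r}\right) < w_k \right\}.$$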

The number of items in each $N_j$ (respectively, $N^j$) is denoted by $n_j$ (respectively,
$n^j$) for all possible values of $j$.

Note that for any $j = 1, 2, \ldots, r-1$, one bin can hold an item from $N_j$ together
with exactly one item from $N^j$. The SIP($r$) heuristic generates pairs of items, one
item from $N_j$ and one from $N^j$, for every $j = 1, 2, \ldots, r-1$. The items in $N_0 \cup N_r$
are put in individual bins; one bin is assigned to each of these items.

For any $j = 1, 2, \ldots, r-1$, we arbitrarily match one item from $N_j$ with exactly
one item from $N^j$; one bin holds each such pair. If $n_j = n^j$, then all the items in
$N_j \cup N^j$ are matched. If, however, $n_j \ne n^j$, then we can match exactly $\min\{n_j, n^j\}$
pairs of items. The remaining $|n_j - n^j|$ items in $N_j \cup N^j$ that have not yet been
matched are put one per bin. Thus, the total number of bins used is

$$n_0 + n_r + \sum_{j=1}^{r-1} \max\{n_j, n^j\}.$$



The heuristic clearly generates a feasible solution to the bin-packing problem. 
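
The bin count of SIP($r$) depends only on how many items fall into each slice, so it can be sketched in a few lines. The code below is an illustrative sketch, not the authors' implementation: it assumes that the $2r$ subsets are the consecutive slices of width $1/(2r)$ of $(0, 1]$, with $N_j$ the $(j+1)$st slice below $1/2$ and $N^j$ the $j$th slice above $1/2$, and the function name `sip_bins` is hypothetical. Floating-point rounding at slice boundaries is ignored.

```python
import math
import random


def sip_bins(weights, r):
    """Number of bins used by SIP(r) on item sizes in (0, 1] (slices of width 1/(2r))."""
    below = [0] * r          # below[j] = n_j for j = 0, ..., r-1 (sizes at most 1/2)
    above = [0] * (r + 1)    # above[j] = n^j for j = 1, ..., r-1; above[r] = n_r
    for w in weights:
        s = min(max(math.ceil(2 * r * w), 1), 2 * r)   # slice index in 1..2r
        if s <= r:
            below[r - s] += 1
        else:
            above[s - r] += 1
    # Items in N_0 and N_r go one per bin; N_j and N^j are paired, leftovers go one per bin.
    return below[0] + above[r] + sum(max(below[j], above[j]) for j in range(1, r))


if __name__ == "__main__":
    rng = random.Random(0)
    n, r = 20_000, 10
    items = [rng.random() for _ in range(n)]
    print(sip_bins(items, r) / n, 0.5 + 1 / (2 * r))
```

On a large uniform sample the ratio of bins to items should be close to $\frac{1}{2} + \frac{1}{2r}$.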



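Since

$$\lim_{n \to \infty} \frac{n_j}{n} = \lim_{n \to \infty} \frac{n^j}{n} = \frac{1}{2r} \quad (a.s.) \text{ for all } j = 1, 2, \ldots, r,$$

we have

$$\lim_{n \to \infty} \frac{b_n^*}{n} \le \lim_{n \to \infty} \frac{1}{n} \left[ n_0 + n_r + \sum_{j=1}^{r-1} \max\{n_j, n^j\} \right] = \frac{1}{2} + \frac{1}{2r} \quad (a.s.).$$

Since this holds for any $r \ge 1$, we see that $\gamma \le \frac{1}{2}$. Since $\gamma \ge E(w)$ (see Exercise 5.4),
then $\gamma \ge \frac{1}{2}$ and we conclude that $\gamma = \frac{1}{2}$ for the uniform distribution on $[0, 1]$.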

Using this idea, we can actually devise an asymptotically optimal heuristic for
instances where the item sizes are uniformly distributed on $[0, 1]$. To formally
define this property, let $Z_n^*$ be the cost of the optimal solution to a problem
of size $n$, and let $Z_n^H$ be the cost of the solution provided by a heuristic
$H$. Let the relative error of a heuristic $H$ on a particular instance of $n$ points be

$$e_n^H = \frac{Z_n^H - Z_n^*}{Z_n^*}.$$

Definition 5.2.2 Let $\Psi$ be a probability measure on the set of instances $\mathcal{I}$. A
heuristic $H$ is asymptotically optimal for $\Psi$ if almost surely

$$\lim_{n \to \infty} e_n^H = 0,$$

where the problem data are generated randomly from $\Psi$.
That is, under certain assumptions on the distribution of the data, $H$ generates
solutions whose relative error tends to zero as $n$, the number of points, tends to
infinity. The above SIP($r$) heuristic is not asymptotically optimal since for any
fixed $r$, the relative error converges to $1/r$.

A truly asymptotically optimal heuristic can easily be constructed. The following 
heuristic is called MATCH. First, sort the items in nonincreasing order of the item 
sizes. Then take the largest item, say item i, and match it with the largest item 
with which it will fit. If no such item exists, then put item i in a bin alone. 
Otherwise, put item i and the item it was matched with in a bin together. Now 
repeat this until all items are packed. The proof of asymptotic optimality is given 
as an exercise (Exercise 5.11). 
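
The steps of MATCH can be sketched as follows. This is a minimal illustration with a hypothetical function name `match_bins`; it keeps the unpacked items in ascending order and uses binary search to find the largest item that still fits next to the current largest one. Removing the partner from a Python list is linear time, so this version is quadratic in the worst case.

```python
from bisect import bisect_right


def match_bins(weights, capacity=1.0):
    """Pack items with MATCH; returns a list of bins, each a tuple of one or two sizes."""
    items = sorted(weights)              # ascending, so the largest item is at the end
    bins = []
    while items:
        largest = items.pop()
        # items[:k] are exactly the remaining items that fit together with `largest`.
        k = bisect_right(items, capacity - largest)
        if k > 0:
            partner = items.pop(k - 1)   # the largest item that fits
            bins.append((largest, partner))
        else:
            bins.append((largest,))
    return bins


if __name__ == "__main__":
    for b in match_bins([0.7, 0.3, 0.6, 0.5, 0.2]):
        print(b)
```

Every bin produced holds at most two items, which is what makes the heuristic easy to analyze for the uniform distribution.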

An additional use for the bin-packing constant $\gamma$ is as an approximation for
the number of bins needed. When $n$ is large, the number of bins required to
pack $n$ random items from $\Phi$ is very close to $n\gamma$. How close the random variable
representing the number of bins is to $n\gamma$ is discussed next.

Martingale Inequalities 

Consider the stochastic processes $\{X_n\}$ and $\{Y_n\}$ with $n \ge 0$. We say that the
stochastic process $\{X_n\}$ is a martingale with respect to $\{Y_n\}$ if, for every $n \ge 0$,
we have

(i) $E[|X_n|] < \infty$, and
(ii) $E[X_{n+1} \mid Y_0, \ldots, Y_n] = X_n$.

To get some insight into the definition of a martingale, consider someone playing
a sequence of fair games. Let $X_n = Y_n$ be the amount of money the player has at
the end of the $n$th game. If $\{X_n\}$ is a martingale with respect to $\{Y_n\}$, then this
says that the expected amount of money the player will have at the end of the
$(n+1)$st game is equal to what the player had at the beginning of that game, $X_n$,
regardless of the game's history prior to state $n$. See Karlin and Taylor (1975) for
details.

Consider now the random variable

$$D_n = E[X_{n+1} \mid Y_0, \ldots, Y_n] - E[X_{n+1} \mid Y_0, \ldots, Y_{n-1}].$$

The sequence $\{D_n\}$ is called a martingale difference sequence if $E[D_n] = 0$ for every
$n \ge 0$. Azuma (1967) developed the following interesting inequality for martingale
difference sequences; see also Stout (1974) or Rhee and Talagrand (1987).

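Lemma 5.2.3 Let $\{D_i\}$, $i = 1, 2, \ldots, n$, be a martingale difference sequence. Then
for every $t > 0$, we have

$$\Pr\left\{ \Big| \sum_{i \le n} D_i \Big| > t \right\} \le 2 \exp\left\{ -t^2 \Big/ \Big( 2 \sum_{i \le n} \|D_i\|_\infty^2 \Big) \right\},$$

where $\|D_i\|_\infty$ is a uniform upper bound on the $D_i$'s.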
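
To get a feel for how tight the inequality is, it can be compared with an exact tail probability in the simplest case. The sketch below assumes the $D_i$ are independent fair $\pm 1$ steps (a martingale difference sequence with $\|D_i\|_\infty = 1$), so the bound becomes $2\exp\{-t^2/(2n)\}$; the function names are hypothetical.

```python
import math


def exact_tail(n, t):
    """Pr{|S_n| > t}, where S_n is the sum of n independent fair +1/-1 steps."""
    favorable = 0
    for heads in range(n + 1):
        if abs(2 * heads - n) > t:       # S_n = heads - tails = 2 * heads - n
            favorable += math.comb(n, heads)
    return favorable / 2 ** n


def azuma_bound(n, t, c=1.0):
    """Right-hand side of Lemma 5.2.3 when every |D_i| is bounded by c."""
    return 2.0 * math.exp(-t * t / (2.0 * n * c * c))


if __name__ == "__main__":
    for n, t in ((50, 10), (200, 20), (200, 40)):
        print(n, t, exact_tail(n, t), azuma_bound(n, t))
```

The exact tail is never larger than the bound; the gap shows how conservative the inequality can be for this particular sequence.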
The lemma can be used to establish upper bounds on the probable deviations
of both

• $b_n^*$ from its mean $E[b_n^*]$, and

• $b_n^*/n$ from its asymptotic value $\gamma$.

For this purpose, define

$$D_i = \begin{cases} E[b_n^* \mid w_1, \ldots, w_i] - E[b_n^* \mid w_1, \ldots, w_{i-1}], & \text{if } i \ge 2; \\ E[b_n^* \mid w_1] - E[b_n^*], & \text{if } i = 1, \end{cases}$$

where $E[b_n^* \mid w_1, \ldots, w_i]$ is the random variable that represents the expected optimal
solution value of the bin-packing problem obtained by fixing the sizes of the first $i$
items and averaging on all other item sizes. Clearly, $E[b_n^* \mid w_1, \ldots, w_n] = b_n^*$, while
$E[b_n^* \mid \emptyset] = E[b_n^*]$. Hence, $\sum_{i=1}^{n} D_i = b_n^* - E[b_n^*]$. Furthermore, the sequence $D_i$
defines a martingale difference sequence with the property that $|D_i| \le 1$ for every
$i \ge 1$.

Applying Lemma 5.2.3, we obtain the following upper bound:

$$\Pr\{ |b_n^* - E[b_n^*]| > t \} = \Pr\left\{ \Big| \sum_{i=1}^{n} D_i \Big| > t \right\} \le 2 \exp\left\{ -\frac{t^2}{2n} \right\}.$$

This bound can now be used to construct an upper bound on the likelihood that
$b_n^*/n$ differs from its asymptotic value by more than some fixed amount.

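Theorem 5.2.4 For every $\epsilon > 0$, there exists an integer $n_0$ such that for all
$n \ge n_0$,

$$\Pr\left\{ \Big| \frac{b_n^*}{n} - \gamma \Big| > \epsilon \right\} \le 2 \exp\left( -\frac{n \epsilon^2}{2} \right).$$

Proof. Theorem 5.2.1 implies that $\lim_{n \to \infty} E[b_n^*/n] = \gamma$, and therefore, for every
$\epsilon > 0$ and $k \ge 2$, there exists $n_0$ such that for all $n \ge n_0$, we have

$$\Big| E\Big[\frac{b_n^*}{n}\Big] - \gamma \Big| < \frac{\epsilon}{k}.$$

Consequently,

$$\Pr\left\{ \Big| \frac{b_n^*}{n} - \gamma \Big| > \epsilon \right\} \le \Pr\left\{ \Big| \frac{b_n^*}{n} - \frac{E[b_n^*]}{n} \Big| + \Big| \frac{E[b_n^*]}{n} - \gamma \Big| > \epsilon \right\}$$
$$\le \Pr\left\{ \Big| \frac{b_n^*}{n} - \frac{E[b_n^*]}{n} \Big| > \frac{\epsilon (k-1)}{k} \right\}$$
$$= \Pr\left\{ |b_n^* - E[b_n^*]| > \frac{n \epsilon (k-1)}{k} \right\}$$
$$\le 2 \exp\left\{ -\frac{n \epsilon^2 (k-1)^2}{2 k^2} \right\}.$$

Since this last inequality holds for arbitrary $k \ge 2$, this completes the
proof. ∎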

These results demonstrate that $b_n^*$ is, in fact, very close to $n\gamma$, and this is true
for any distribution of the item sizes. Therefore, it suggests that $n\gamma$ may serve as
a good approximation for $b_n^*$ in other, more complex, combinatorial problems.



5.3 The Traveling Salesman Problem 

In this section, we demonstrate an important use for the tools presented above. Our 
objective is to show how probabilistic analysis can be used to construct effective 
algorithms with certain attractive theoretical properties. 

Let $x_1, x_2, \ldots, x_n$ be a sequence of points in the Euclidean plane ($\mathbb{R}^2$), and let
$L_n^*$ be the length of the optimal traveling salesman tour through these $n$ points.
We start with a deterministic upper bound on $L_n^*$ developed by Few (1955). We
follow Jaillet's (1985) presentation.

Theorem 5.3.1 Let $a \times b$ be the size of the smallest rectangle that contains
$x_1, x_2, \ldots, x_n$; then

$$L_n^* \le \sqrt{2(n-2)ab} + 2(a + b).$$

Proof. For an integer $m$ (to be determined), partition the rectangle of size $a \times b$
(where $a$ is the length and $b$ is the height) into $2m$ equal-width horizontal strips.
This creates $2m + 1$ horizontal lines and two vertical lines (counting the boundaries
of the rectangle). Label the horizontal lines $1, 2, \ldots, 2m+1$ moving downward. Now
temporarily delete all horizontal lines with an even label. Connect each point $x_i$,
$i = 1, 2, \ldots, n$, with two vertical segments, to the closest (odd-labeled) horizontal
line. A path through $x_1, \ldots, x_n$ can now be constructed by proceeding from, say,
the upper left-hand corner of the $a \times b$ rectangle and moving from left to right
on the first horizontal line, picking up all points that are connected (with the two
vertical segments) to this line. Then we proceed downward and cover the third
horizontal line from right to left. This continues until we reach the end of the
$(2m+1)$st line. This path can be extended to a traveling salesman tour by returning
from the last point to the first by adding at most one vertical and one horizontal
line (we avoid diagonal movements for the sake of simplicity). Now repeat this
procedure with the even-labeled horizontal lines and, in a similar manner, create a
path through all the customers. Extend this path to a traveling salesman tour by
adding one horizontal line and one vertical segment of length $b - \frac{b}{m}$. See Fig. 5.1.
Clearly, the sum of the length of the two traveling salesman tours is

$$a(2m+1) + \frac{nb}{m} + 2b + a + 2\left(b - \frac{b}{m}\right).$$

Since $L_n^*$ is no larger than either of these two tours, we have

$$L_n^* \le a + 2b + ma + (n-2)\frac{b}{2m}.$$
FIGURE 5.1. The two traveling salesman tours (1st tour and 2nd tour) constructed by the partitioning algorithm

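The right-hand side is convex in $m$; hence, we minimize on $m$. That is, we choose

$$m^* = \left\lceil \sqrt{\frac{b(n-2)}{2a}} \right\rceil;$$

then

$$L_n^* \le a + 2b + m^* a + \frac{b(n-2)}{2m^*}$$
$$\le a + 2b + a\left( \sqrt{\frac{b(n-2)}{2a}} + 1 \right) + \frac{b(n-2)}{2} \sqrt{\frac{2a}{(n-2)b}}$$
$$= \sqrt{2(n-2)ab} + 2(a + b). \qquad ∎$$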
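
The bound of Theorem 5.3.1 is easy to evaluate and, for very small instances, to compare with the true optimum. The sketch below is only an illustration: `few_bound` evaluates the right-hand side of the theorem for the smallest axis-parallel enclosing rectangle, and `optimal_tour_length` is a brute-force enumeration that is usable only for a handful of points. Both names are hypothetical.

```python
import math
from itertools import permutations


def few_bound(points):
    """sqrt(2(n-2)ab) + 2(a+b) for the smallest axis-parallel rectangle containing the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    a = max(xs) - min(xs)
    b = max(ys) - min(ys)
    n = len(points)
    return math.sqrt(2 * (n - 2) * a * b) + 2 * (a + b)


def optimal_tour_length(points):
    """Brute-force optimal tour length; point 0 is fixed as the start (n >= 2, small n only)."""
    n = len(points)
    best = math.inf
    for perm in permutations(range(1, n)):
        order = (0,) + perm
        length = sum(math.dist(points[order[i]], points[order[(i + 1) % n]])
                     for i in range(n))
        best = min(best, length)
    return best


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5), (0.2, 0.7), (0.9, 0.3)]
    print(optimal_tour_length(pts), few_bound(pts))
```

For the four corners of the unit square the optimal tour has length 4, while the bound evaluates to 6.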



The above result implies that the length of the optimal traveling salesman tour
is at most $O(\sqrt{n})$. In 1959, Beardwood et al. showed that the rate of growth of $L_n^*$,
when customer locations are independent and identically distributed, is $\Theta(\sqrt{n})$.
Specifically, they prove the following result.

Theorem 5.3.2 Let $x_1, x_2, \ldots, x_n$ be a sequence of independent random variables
having a distribution $\mu$ with compact support in $\mathbb{R}^2$. Then there exists a constant
$\beta > 0$, independent of the distribution $\mu$, such that with probability 1,

$$\lim_{n \to \infty} \frac{L_n^*}{\sqrt{n}} = \beta \int_{\mathbb{R}^2} f^{1/2}(x)\,dx,$$

where $f$ is the density of the absolutely continuous part of the distribution $\mu$.

Since Beardwood et al. proved this result, many researchers have proved it using 
a variety of techniques. One of these methods is based on the concept of Euclidean 
subadditive processes (Steele 1981), which is a generalization of the concept of 
subadditive processes described earlier. 

In this subsection, we are not going to prove the result, but rather concentrate on 
its algorithmic implications. Specifically, we will describe the following polynomial- 
time algorithm, which is asymptotically optimal. The heuristic was suggested by 
Karp (1977) although we have modified it in several places for the purpose of 
clarifying the presentation. 
FIGURE 5.2. Region-partitioning example with $n = 17$, $q = 3$, $h = 2$, and $t = 1$

A Region-Partitioning Heuristic 

In the region-partitioning heuristic, the region containing the points is sub- 
divided into subregions such that each nonempty subregion contains exactly q 
customers (except possibly for one) and where q is to be determined later. The 
heuristic then constructs an optimal traveling salesman tour on the set of points 
within or bordering each subregion and then connects these tours to form a trav- 
eling salesman tour through all the points. 

To generate subregions each with exactly $q$ points, except for possibly one
subregion where there may be fewer points, we use the following strategy: The smallest
rectangle with sides $a$ and $b$ containing the set of points $x_1, x_2, \ldots, x_n$ is partitioned
by means of horizontal and vertical lines. First, the region is divided by $t$ vertical
lines such that each subregion contains exactly $(h+1)q$ points except possibly
the last one. This is done precisely as follows: Temporarily index the customers in
increasing order of their horizontal coordinate. Place the vertical lines so that the
$j$th vertical line (for $j \le t$) goes through the customer with index $j(h+1)q$. Each
of these $t + 1$ subregions is then partitioned by means of $h$ horizontal lines into
$h + 1$ smaller subregions such that each contains exactly $q$ points except possibly
the last one. More precisely, this is done as follows: In each vertical strip, index
the customers in increasing order of their vertical coordinates. Place the horizontal
lines so that the $j$th horizontal line (for $j \le h$) goes through the customer with
index $jq$. See Fig. 5.2 for an example.
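
The partitioning step can be sketched as two rounds of sorting and chunking. The code below is a simplified illustration with a hypothetical function name `region_partition`: it takes $q$ and $h$ as inputs and assigns every customer to exactly one subregion, whereas in the construction above a customer lying on a dividing line borders two subregions.

```python
def region_partition(points, q, h):
    """Split points into vertical strips of (h+1)*q points, then each strip into groups of q."""
    strip_size = (h + 1) * q
    by_x = sorted(points, key=lambda p: p[0])          # index by horizontal coordinate
    subregions = []
    for start in range(0, len(by_x), strip_size):
        # Within a strip, index the customers by vertical coordinate.
        strip = sorted(by_x[start:start + strip_size], key=lambda p: p[1])
        for c in range(0, len(strip), q):
            subregions.append(strip[c:c + q])
    return subregions


if __name__ == "__main__":
    # Same sizes as the example of Fig. 5.2: n = 17, q = 3, h = 2 (coordinates are made up).
    pts = [((7 * i) % 17, (5 * i) % 13) for i in range(17)]
    for region in region_partition(pts, q=3, h=2):
        print(len(region), region)
```

With $n = 17$, $q = 3$, and $h = 2$ this produces two vertical strips (so $t = 1$) and six subregions, all with three points except one with two.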

To solve the traveling salesman problems within each subregion, we use a
dynamic programming algorithm developed by Held and Karp (1962). It finds an
optimal traveling salesman tour through $m$ points in running time that is $O(m^2 2^m)$.
If we choose $q = \lceil \log n \rceil$, then solving the traveling salesman problem for a
single region takes $O(n \log^2 n)$, and since the number of subregions is no more than
$1 + n/\log n$, the total time spent solving these traveling salesman problems is
$O(n^2 \log n)$.
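
A compact version of the subset dynamic program is sketched below for reference. It is a minimal illustration, assuming the distances are given as a square matrix `dist` with city 0 as the fixed starting city; the function name `held_karp` and the bitmask encoding of subsets are choices made for this sketch, not details taken from the original paper.

```python
from itertools import combinations


def held_karp(dist):
    """Length of an optimal tour, via dynamic programming over subsets of cities."""
    n = len(dist)
    if n == 1:
        return 0.0
    # best[(mask, j)]: shortest path that starts at city 0, visits exactly the
    # cities encoded in `mask` (a subset of 1..n-1), and ends at city j in that subset.
    best = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask ^ (1 << j)
                best[(mask, j)] = min(best[(prev, k)] + dist[k][j]
                                      for k in subset if k != j)
    full = (1 << n) - 2                   # every city except city 0
    return min(best[(full, j)] + dist[j][0] for j in range(1, n))


if __name__ == "__main__":
    import math
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    d = [[math.dist(p, s) for s in square] for p in square]
    print(held_karp(d))
```

There are on the order of $m 2^m$ states and each is computed from at most $m$ predecessors, which matches the $O(m^2 2^m)$ running time quoted above.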
FIGURE 5.3. The tour generated by the region-partitioning algorithm (original tour and the tour obtained after using shortcuts)



After finding optimal traveling salesman tours within each subregion, observe 
that this collection of traveling salesman tours can be easily transformed into 
a single traveling salesman tour through all the points. This is true since this 
collection of tours, along with the lines added as above, defines an Eulerian graph, 
where the degree of each point (node) is either two or four (a point that is on the 
boundary of two subregions will have degree 4). Thus, this tour can be transformed 
into a single traveling salesman tour, and using shortcuts can further reduce its 
length. See Fig. 5.3. 

To guarantee that each nonempty subregion has exactly $q$ points, except for
maybe one, $h$ and $t$ must satisfy

$$t = \left\lceil \frac{n}{(h+1)q} \right\rceil - 1$$

and

$$t(h+1)q < n \le (t+1)(h+1)q.$$

This is achieved by choosing $h = \left\lceil \sqrt{n/q} - 1 \right\rceil$.



Let $L^{RP}$ be the length of the tour generated by the above region-partitioning
heuristic. To establish the quality of the heuristic, we need to find an upper bound
on $L^{RP}$; this is provided by the following.

Lemma 5.3.3

$$L^{RP} \le L^* + \frac{3}{2} P^{RP},$$

where $P^{RP}$ is the sum of the perimeters of all subregions generated by the
region-partitioning heuristic.



Proof. Let $L_j$ be the length of the optimal traveling salesman tour in subregion
$j$, for $j = 1, 2, \ldots$. Similarly, let $L_j^*$ be the sum of the lengths of all segments of the
optimal traveling salesman tour through all $n$ customers that are contained in the
$j$th subregion, for $j \ge 1$. Since the collection of tours and lines constructed above
defines an Eulerian graph, we have $L^{RP} \le \sum_j L_j$. Also, by definition, we have
$L^* = \sum_j L_j^*$. Thus, it is sufficient to show that

$$L_j \le L_j^* + \frac{3}{2} P_j, \qquad (5.1)$$

where $P_j$ is the perimeter of subregion $j$.

To prove inequality (5.1), assume there are exactly $k$ continuous segments $S_1, \ldots,
S_k$ of the globally optimal traveling salesman tour in subregion $j$; see Fig. 5.4. Let
the $2k$ endpoints of these segments be $y_1, y_2, \ldots, y_{2k}$ ordered consecutively around
the boundary of subregion $j$. Without loss of generality, we assume that

$$\ell(y_1 y_2) + \ell(y_3 y_4) + \cdots + \ell(y_{2k-1} y_{2k}) \le \ell(y_2 y_3) + \ell(y_4 y_5) + \cdots + \ell(y_{2k} y_1),$$

where $\ell(y_i y_{i+1})$ is the distance between points $y_i$ and $y_{i+1}$ along the perimeter
of the $j$th subregion. We construct a feasible solution for the traveling salesman
problem defined by the points $x_i$ that are in the $j$th subregion. The tour is based
on (i) the segments $S_1, \ldots, S_k$, (ii) two copies of each segment $y_1 y_2, y_3 y_4, \ldots,
y_{2k-1} y_{2k}$, and (iii) one copy of each segment $y_2 y_3, y_4 y_5, \ldots, y_{2k} y_1$.

FIGURE 5.4. The segments $S_1, \ldots, S_k$ and the corresponding Eulerian graph

Observe (Fig. 5.4) that the above three components define an Eulerian graph
whose set of vertices is the points that belong to the $j$th subregion plus all the
points $y_i$, for $i = 1, 2, \ldots, 2k$. This implies that the graph has an Eulerian tour
whose cost is no more than

$$L_j^* + \frac{3}{2} P_j.$$

This tour can be converted into a traveling salesman tour, using shortcuts, and
therefore,

$$L_j \le L_j^* + \frac{3}{2} P_j.$$

Summing these up on $j$ completes the proof. ∎

We can now prove the following result due to Karp. 

Theorem 5.3.4 Under the conditions of Theorem 5.3.2, with probability 1,

$$\lim_{n \to \infty} \frac{L^*}{\sqrt{n}} = \lim_{n \to \infty} \frac{L^{RP}}{\sqrt{n}}.$$

Proof. Lemma 5.3.3 implies

$$L^* \le L^{RP} \le L^* + \frac{3}{2} P^{RP}.$$

Hence, we need to evaluate the quantity $P^{RP}$. Note that the number of vertical
lines added in the construction of the subregions is $t \le \sqrt{n/q}$. Each of these lines is
counted twice in the quantity $P^{RP}$.

In the second step of the RP heuristic, we add $h$ horizontal lines, where $h \le \sqrt{n/q}$.
These horizontal lines are also counted twice in $P^{RP}$. It follows that

$$P^{RP} \le 2\sqrt{\frac{n}{q}}\,(a + b) + 2(a + b) \le 2\sqrt{\frac{n}{\log n}}\,(a + b) + 2(a + b),$$

where the right-side inequality is justified by the definition of $q$.
Consequently,

$$\frac{L^*}{\sqrt{n}} \le \frac{L^{RP}}{\sqrt{n}} \le \frac{L^*}{\sqrt{n}} + \frac{3}{2} \frac{P^{RP}}{\sqrt{n}} \le \frac{L^*}{\sqrt{n}} + \frac{3(a + b)}{\sqrt{\log n}} + \frac{3(a + b)}{\sqrt{n}}.$$

Taking the limit as $n$ goes to infinity proves the theorem. ∎



5.4 Exercises 



Exercise 5.1. A lower bound on $\beta$. Let $X(n) = \{x_1, \ldots, x_n\}$ be a set of points
uniformly and independently distributed in the unit square. Let $\ell_j$ be the distance
from $x_j \in X(n)$ to the nearest point in $X(n) \setminus \{x_j\}$. Let $L(X(n))$ be the length of
the optimal traveling salesman tour through $X(n)$. Clearly, $E(L(X(n))) \ge nE(\ell_1)$.
We evaluate a lower bound on $\beta$ in the following way.

(a) Find $\Pr(\ell_1 \ge \ell)$.

(b) Use (a) to calculate a lower bound on $E(\ell_1) = \int_0^\infty \Pr(\ell_1 \ge \ell)\,d\ell$.

(c) Use Stirling's formula to approximate the bound when $n$ is large.

(d) Show that $\frac{1}{2}$ is a lower bound on $\beta$.

Exercise 5.2. An upper bound on $\beta$. (Karp and Steele 1985) The strips method
for constructing a tour through $n$ random points in the unit square dissects the
square into $1/\Delta$ horizontal strips of width $\Delta$, and then follows a zigzag path, visiting
the points in the first strip in left-to-right order, then the points in the second strip
in right-to-left order, etc., finally returning to the initial point from the final point
of the last strip. Prove that when $\Delta$ is suitably chosen, the expected length of the
tour produced by the strips method is at most $1.16\sqrt{n}$.

Exercise 5.3. Consider the TSP defined on a set of points N indexed 1 , 2, . . . , n. 
Let Z* be the length of the optimal tour. Consider now the following strategy: 
Starting with point 1, the salesman moves to the closest point in the set N \ { 1 }, 
say point 2. The salesman then constructs an optimal traveling salesman tour 
defined on this set of n — 1 points (N \ {1}) and then returns to point 1 through 
point 2. Show that the length of this tour is no larger than 3Z*/2. Is the bound 
tight? 

Exercise 5.4. Prove that the bin-packing constant $\gamma$ satisfies $1 \le \gamma / E(w) \le 2$,
where $E(w)$ is the expected item size.

Exercise 5.5. The harmonic heuristic with parameter $M$, denoted $H(M)$, is the
following. For each $k = 1, 2, \ldots, M-1$, items of size $\frac{1}{k+1} < w_i \le \frac{1}{k}$ are packed
separately, at most $k$ items per bin. That is, items of size greater than $\frac{1}{2}$ are packed
one per bin, items of size $\frac{1}{3} < w_i \le \frac{1}{2}$ are packed two per bin, and so forth. Finally,
items of size $w_i \le \frac{1}{M}$ are packed separately from the rest using first-fit.

Given n items drawn randomly from the uniform distribution on ( | , 0] , what is 
the asymptotic number of bins used by $H(5)$?

Exercise 5.6. Suggest a method to pack n items drawn randomly from the 
uniform distribution on [|, 1 ]. Can you prove that your method is asymptotically 
optimal? What is the bin-packing constant ($\gamma$) for this distribution?

Exercise 5.7. Suggest a method to pack n items drawn randomly from the 
uniform distribution on [0, ^]. Can you prove that your method is asymptotically 
optimal? What is the bin-packing constant ($\gamma$) for this distribution?

Exercise 5.8. Suggest a method to pack n items drawn randomly from the uni- 
form distribution on [^, ^]. Can you prove that your method is asymptotically 
optimal? What is the bin-packing constant ($\gamma$) for this distribution?
Exercise 5.9. (Dreyfus and Law 1977) The following is a dynamic programming
procedure to solve the TSP. Let city 1 be an arbitrary city. Define the following
function:

$f_i(j, S)$ = the length of the shortest path from city 1 to
city $j$ visiting cities in the set $S$, where $|S| = i$.

Determine the recursive formula and solve the following instance. 



The distances between cities 
Exercise 5.10. What is the complexity of the dynamic program developed in
the previous exercise?

Exercise 5.11. (Coffman and Lueker 1991) Consider flipping a fair coin $n$ times
in succession. Let $X_n$ represent the random variable denoting the maximum excess
of the number of heads over tails at any point in the sequence of $n$ flips. It is known
that $E(X_n)$ is $\Theta(\sqrt{n})$. From this, argue that

$$E[Z_n^{MATCH}] = \frac{n}{2} + \Theta(\sqrt{n}).$$

Exercise 5.12. Assume $n$ cities are uniformly distributed in the unit disc.
Consider the following heuristic for the $n$-city TSP. Let $d_i$ be the distance from city $i$
to the depot. Order the points so that $d_1 \le d_2 \le \cdots \le d_n$. For each $i = 1, 2, \ldots, n$,
draw a circle of radius $d_i$ centered at the depot; call this circle $i$. Starting at the
depot, travel directly to city 1. From city 1, travel to circle 2 in a direction along
the ray through city 1 and the depot. When circle 2 is reached, follow circle 2 in
the direction (clockwise or counterclockwise) that results in a shorter route to city
2. Repeat this same step until city $n$ is reached; then return to the depot. What is
the asymptotic rate of growth of the length of this traveling salesman tour? Is this
heuristic asymptotically optimal?
Mathematical Programming-Based
Bounds



6.1 Introduction 

An important method of assessing the effectiveness of any heuristic is to compare 
it to the value of a lower bound on the cost of an optimal solution. In many 
cases, this is not an easy task; constructing strong lower bounds on the optimal 
solution may be as difficult as solving the problem. An attractive approach for 
generating a lower bound on the optimal solution to an NP-Complete problem is
the following mathematical programming approach. First, formulate the problem 
as an integer program; then relax the integrality constraint and solve the resulting 
linear program. 

What problems do we encounter when we try to use this approach? One difficulty 
is deciding on an integer programming formulation. There are myriad possible 
formulations from which to choose. Another difficulty may be that in order to 
formulate the problem as an integer program, a large (sometimes exponential) 
number of variables are required. That is, the resulting linear program may be 
very large, so that it is not possible to use standard linear programming solvers. 
The third problem is that it is not clear how tight the lower bound provided by 
the linear relaxation will be. This depends on the problem and the formulation. 

In the sections below, we demonstrate how a general class of formulations can 
provide tight lower bounds on the original integer program. In later chapters, 
we show that these and similar linear programs can be solved effectively and 
implemented in algorithms that solve logistics problems to optimality or near 
optimality. 
6.2 An Asymptotically Tight Linear Program



Again, consider the bin-packing problem. There are many ways to formulate the 
problem as an integer program. The one we use here is based on formulating it as 
a set-partitioning problem. The idea is as follows. Let F be the collection of all 
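sets of items that can be feasibly packed into one bin; that is,

$$F = \Big\{ S \subseteq N : \sum_{i \in S} w_i \le 1 \Big\}.$$

For any $i \in N$ and $S \in F$, let

$$\alpha_{iS} = \begin{cases} 1, & \text{if } i \in S, \\ 0, & \text{otherwise.} \end{cases}$$

Let

$$y_S = \begin{cases} 1, & \text{if the set of items } S \text{ is placed in a single bin,} \\ 0, & \text{otherwise.} \end{cases}$$

Then the set-partitioning formulation of the bin-packing problem is the following
integer program:

Problem $P$: Min $\sum_{S \in F} y_S$

s.t.

$\sum_{S \in F} \alpha_{iS}\, y_S = 1, \quad \forall i \in N,$ (6.1)

$y_S \in \{0, 1\}, \quad \forall S \in F.$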

In this section, we prove that the relative difference between the optimal solution 
of the linear relaxation of Problem P and the optimal solution of Problem P (the 
integer solution) tends to zero as \N\ = n, the number of items, increases. First, 
we need the following definition. 

Definition 6.2.1 A function $\phi$ is Lipschitz continuous of order $q$ on a set $A \subseteq \mathbb{R}$
if there exists a constant $K$ such that

$$|\phi(x) - \phi(y)| \le K |x - y|^q, \quad \forall x, y \in A.$$

Our first result of this section is the following.

Theorem 6.2.2 Let the item sizes be independently and identically distributed
according to a distribution $\Phi$, which is Lipschitz continuous of order $q \ge 1$ on
$[0, 1]$. Let $b^{LP}$ be the value of the optimal solution to the linear relaxation of $P$,
and let $b^*$ be the value of the optimal integer solution to $P$, that is, the value of
the optimal solution to the bin-packing problem. Then, with probability 1,

$$\lim_{n \to \infty} \frac{1}{n} b^{LP} = \lim_{n \to \infty} \frac{1}{n} b^*.$$
To prove the theorem, we consider a related model. Consider a discretized
bin-packing problem in which there are a finite number $W$ of item sizes. Each
different size defines an item type. Let $n_i$ be the number of items of type $i$, for
$i = 1, 2, \ldots, W$, and let $n = \sum_{i=1}^{W} n_i$ be the total number of items. Clearly,
this discretized bin-packing problem can be solved by formulating it as the
set-partitioning problem $P$. To obtain some intuition about the linear relaxation of
$P$, we first introduce another formulation closely related to $P$.

Let a bin assignment be a vector $(a_1, a_2, \ldots, a_W)$, where $a_i \ge 0$ are integers, and
such that a single bin can contain $a_1$ items of type 1, along with $a_2$ items of type
2, ..., along with $a_W$ items of type $W$, without violating the capacity constraint.
Index all the possible bin assignments $1, 2, \ldots, R$, and note that $R$ is independent
of $n$. The bin-packing problem can be formulated as follows. Let

$A_{ir}$ = number of items of type $i$ in bin assignment $r$,
for each $i = 1, 2, \ldots, W$ and $r = 1, 2, \ldots, R$. Let

$y_r$ = number of times bin assignment $r$ is used in the optimal solution.

The new formulation of the discretized bin-packing problem is

Problem $P_D$: Min $\sum_{r=1}^{R} y_r$

s.t.

$\sum_{r=1}^{R} y_r A_{ir} \ge n_i, \quad \forall i = 1, 2, \ldots, W,$

$y_r \ge 0$ and integer, $\forall r = 1, 2, \ldots, R.$
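
To see concretely what the columns of Problem $P_D$ look like, the bin assignments can be enumerated for a small set of item types. The sketch below is an illustration only: the function name `bin_assignments` is hypothetical, sizes are passed as exact fractions to avoid rounding issues at the capacity limit, and the all-zero vector is left out as a matter of convenience.

```python
from fractions import Fraction


def bin_assignments(sizes, capacity=Fraction(1)):
    """All nonzero vectors (a_1, ..., a_W) of nonnegative integers that fit in one bin."""
    patterns = []

    def extend(i, remaining, current):
        if i == len(sizes):
            if any(current):
                patterns.append(tuple(current))
            return
        # Try every feasible count of type i, then recurse on the remaining types.
        for count in range(int(remaining // sizes[i]) + 1):
            extend(i + 1, remaining - count * sizes[i], current + [count])

    extend(0, capacity, [])
    return patterns


if __name__ == "__main__":
    sizes = [Fraction(1, 2), Fraction(1, 3)]
    for a in bin_assignments(sizes):
        print(a)
```

The number of assignments $R$ found this way depends only on the $W$ item sizes, not on how many items of each type there are, which is the property used in the argument that follows.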

Let $b_D^*$ be the value of the optimal solution to Problem $P_D$ and let $b_D^{LP}$ be the
optimal solution value of the linear relaxation of Problem $P_D$. Clearly, Problem P and
Problem $P_D$ have the same optimal solution values; that is, $b^* = b_D^*$. On the other
hand, $b^{LP}$ is not necessarily equal to $b_D^{LP}$. However, it is easy to see that any feasible
solution to the linear relaxation of Problem P can be used to construct a feasible
solution to the linear relaxation of Problem $P_D$, and therefore,

$$b^{LP} \ge b_D^{LP}. \tag{6.2}$$

The following is the crucial lemma needed to prove Theorem 6.2.2. 

Lemma 6.2.3

$$b^{LP} \le b^* \le b_D^{LP} + W \le b^{LP} + W.$$

Proof. The leftmost inequality is trivial, while the rightmost inequality is due
to (6.2). To prove the central inequality, note that in Problem $P_D$ there are $W$
constraints, one for each item type. Let $y_r$, for $r = 1, 2, \ldots, R$, be an optimal
solution to the linear relaxation of Problem $P_D$ and observe that there exists such
an optimal solution with at most $W$ positive variables, one for each constraint.
We construct a feasible solution to Problem $P_D$ by rounding the linear solution
up; that is, for each $r = 1, 2, \ldots, R$ with $y_r > 0$, we set $y_r = \lceil y_r \rceil$, and for each
$r = 1, 2, \ldots, R$ with $y_r = 0$, we keep $y_r = 0$. Hence, the increase in the objective
function is no more than $W$. ∎

Observe that the upper bound on $b^*$ obtained in Lemma 6.2.3 consists of two
terms. The first, $b^{LP}$, is a lower bound on $b^*$, which clearly grows with the number
of items $n$. The second term ($W$) is independent of $n$. Therefore, the upper bound
on $b^*$ of Lemma 6.2.3 is dominated by $b^{LP}$; consequently, we see that for large $n$,
$b^* \approx b^{LP}$, exactly what is implied by Theorem 6.2.2.
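
The rounding argument of Lemma 6.2.3 can be illustrated on a tiny discretized instance. The sketch below is only an illustration under stated assumptions: the integer sizes, the capacity, the demands, and the fractional vector y are hypothetical, and y is merely a feasible fractional solution with at most W positive entries (it is not claimed to be LP-optimal). The function names are ours, not the book's.

```python
from itertools import product
from math import ceil

def bin_assignments(sizes, capacity):
    """Enumerate all nonzero vectors (a_1, ..., a_W) that fit in one bin."""
    ranges = [range(capacity // s + 1) for s in sizes]
    return [a for a in product(*ranges)
            if any(a) and sum(ai * s for ai, s in zip(a, sizes)) <= capacity]

def covers(assignments, y, demand):
    """Check sum_r y_r * A_ir >= n_i for every item type i."""
    return all(sum(yr * a[i] for yr, a in zip(y, assignments)) >= n_i
               for i, n_i in enumerate(demand))

# Hypothetical instance: two item types of sizes 4 and 3, bins of capacity 10,
# with n_1 = 5 and n_2 = 7 items to pack.
sizes, capacity, demand = (4, 3), 10, (5, 7)
A = bin_assignments(sizes, capacity)   # the R possible bin assignments
W = len(sizes)

# A feasible fractional solution using only W = 2 bin assignments.
y = [0.0] * len(A)
y[A.index((1, 2))] = 3.5
y[A.index((2, 0))] = 0.75

# Rounding up keeps feasibility and adds at most W bins.
rounded = [ceil(v) for v in y]
print(len(A), sum(y), sum(rounded))
```

Because each positive variable increases by less than one when rounded up, and only W variables are positive, the objective grows by less than W, which is the central inequality of the lemma.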

We can now use the intuition developed in the above analysis of the discrete 
bin-packing problem to prove Theorem 6.2.2. 

Proof. It is clear that $b^{LP} \le b^*$, and therefore, $\lim_{n \to \infty} b^{LP}/n \le \lim_{n \to \infty} b^*/n$.
To prove the upper bound, partition the interval (0, 1] into $k \ge 2$ subintervals
of equal length. Let $N_j$ be the set of items whose size $w$ satisfies $\frac{j-1}{k} < w \le \frac{j}{k}$,
and let $|N_j| = n_j$, $j = 1, 2, \ldots, k$. We construct a new bin-packing problem where
item sizes take only the values $\frac{j}{k}$, $j = 1, 2, \ldots, k-1$, and where the number of
items of size $\frac{j}{k}$ is $\min\{n_j, n_{j+1}\}$, $j = 1, 2, \ldots, k-1$. We refer to this instance of
the bin-packing problem as the reduced instance. For this reduced instance, define
$\bar{b}^*$, $\bar{b}^{LP}$, and $\bar{b}_D^{LP}$ to be the obvious quantities.

It is easy to see that we can always construct a feasible solution to the original 
bin-packing problem by solving the bin-packing problem defined on the reduced 
instance and then assigning each of the remaining items to a single bin. This 
results in 



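$$
\begin{aligned}
b^* &\le \bar{b}^* + \sum_{j=1}^{k-1} |n_j - n_{j+1}| + n_k \\
&\le \bar{b}_D^{LP} + k + \sum_{j=1}^{k-1} |n_j - n_{j+1}| + n_k \quad \text{(using Lemma 6.2.3)} \\
&\le \bar{b}^{LP} + k + \sum_{j=1}^{k-1} |n_j - n_{j+1}| + n_k.
\end{aligned}
$$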

We now argue that $\bar{b}^{LP} \le b^{LP}$. This must be true since every item in the reduced
instance can be associated with a unique item in the original instance whose size
is at least as large. Thus, every feasible solution to the linear relaxation of the
set-partitioning problem defined on the original instance is feasible for the same
problem on the reduced instance. Hence,

$$b^* \le b^{LP} + k + \sum_{j=1}^{k-1} |n_j - n_{j+1}| + n_k.$$

The strong law of large numbers and the mean value theorem imply that for a
given $j = 1, 2, \ldots, k$ there exists $s_j$ such that

$$\lim_{n \to \infty} \frac{n_j}{n} = \frac{\phi(s_j)}{k},$$

where $\phi$ is the density of item sizes. Hence,

$$
\begin{aligned}
\lim_{n \to \infty} \frac{1}{n} |n_j - n_{j+1}| &= \frac{1}{k} |\phi(s_j) - \phi(s_{j+1})| \\
&\le \frac{1}{k} K (s_{j+1} - s_j)^q \quad \text{(by Lipschitz continuity)} \\
&\le \frac{2^q K}{k^{q+1}} \quad \left(\text{since } s_{j+1} - s_j \le \frac{2}{k}\right) \\
&\le \frac{2K}{k^2} \quad \text{(since } q \ge 1\text{)}.
\end{aligned}
$$

Consequently,

$$\lim_{n \to \infty} \frac{b^*}{n} \le \lim_{n \to \infty} \frac{b^{LP}}{n} + \frac{K(2k-1)}{k^2}.$$

Since this holds for arbitrary $k$, this completes the proof. ∎

In fact, it appears that the linear relaxation of the set-partitioning formulation
may be extremely close to the optimal solution in the case of the bin-packing
problem. Chan et al. (1998) show that the worst-case effectiveness of the
set-partitioning lower bound (the linear relaxation), that is, the maximum ratio of the
optimal integer solution ($b^*$) to the optimal linear relaxation $b^{LP}$, is $\frac{4}{3}$. They also
provide an example that achieves this bound. That is, for any number of items
and any set of item weights, the linear program is at least 75% of the optimal
solution.



6.3 Lagrangian Relaxation 

In 1971, Held and Karp applied a mathematical technique known as 
Lagrangian relaxation to generate a lower bound on a general integer (linear) 
program. Our discussion of the method follows the elegant presentation of Fisher 
(1981). We start with the following integer program: 

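$$
\begin{aligned}
\text{Problem } P: \quad Z = \min \ & cx \\
\text{s.t.} \quad & Ax = b, && (6.3) \\
& Dx \le e, && (6.4) \\
& x \ge 0 \text{ and integer},
\end{aligned}
$$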




where $x$ is an $n$-vector, $b$ is an $m$-vector, $e$ is a $k$-vector, $A$ is an $m \times n$ matrix, and
$D$ is a $k \times n$ matrix. Let the optimal solution value of the linear relaxation of Problem P
be $Z_{LP}$. The Lagrangian relaxation of constraints (6.3) with multipliers $u \in \mathbb{R}^m$ is

$$
\begin{aligned}
\text{Problem } LR_u: \quad Z_D(u) = \min \ & cx + u(Ax - b) \\
\text{s.t.} \quad & Dx \le e, && (6.5) \\
& x \ge 0 \text{ and integer}.
\end{aligned}
$$



The following is a simple observation. 

Lemma 6.3.1 For all $u \in \mathbb{R}^m$, $Z_D(u) \le Z$.

Proof. Let $x$ be any feasible solution to Problem P. Clearly, $x$ is also feasible for
$LR_u$, and since $Z_D(u)$ is its optimal solution value, we get

$$Z_D(u) \le cx + u(Ax - b) = cx.$$

Consequently, $Z_D(u) \le Z$. ∎

Remark: If the constraints $Ax = b$ in Problem P are replaced with the constraints
$Ax \le b$, then Lemma 6.3.1 holds for $u \in \mathbb{R}^m_+$.

Since $Z_D(u) \le Z$ holds for all $u$, we are interested in the vector $u$ that provides
the largest possible lower bound. This is achieved by solving Problem D, called
the Lagrangian dual, defined as follows:

$$\text{Problem } D: \quad Z_D = \max_u Z_D(u).$$

Problem D has a number of important and interesting properties. 

Lemma 6.3.2 The function $Z_D(u)$ is a piecewise linear concave function of $u$.

This implies that $Z_D(u)$ attains its maximum at a nondifferentiable point. This
maximal point can be found using a technique called subgradient optimization,
which can be described as follows: Given an initial vector $u^0$, the method generates
a sequence of vectors $\{u^k\}$ defined by

$$u^{k+1} = u^k + t_k (Ax^k - b), \tag{6.6}$$

where $x^k$ is an optimal solution to Problem $LR_{u^k}$ and $t_k$ is a positive scalar called
the step size. Polyak (1967) shows that if the step sizes $t_1, t_2, \ldots$ are chosen such
that $\lim_{k \to \infty} t_k = 0$ and $\sum_{k \ge 0} t_k$ is unbounded, then $Z_D(u^k)$ converges to $Z_D$.
The step size commonly used in practice is

$$t_k = \frac{\lambda_k (UB - Z_D(u^k))}{\sum_{i=1}^{m} (a_i x^k - b_i)^2},$$

where $UB$ is an upper bound on the optimal integer solution value (found using a
heuristic), $a_i x^k - b_i$ is the difference between the left-hand side and the right-hand
side of the $i$th constraint in $Ax^k \le b$, and $\lambda_k$ is a scalar satisfying $0 < \lambda_k \le 2$.
Usually, one starts with $\lambda_0 = 2$ and cuts it in half every time $Z_D(u)$ fails to increase
after a number of iterations.
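
The update (6.6) and the step-size rule can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the subproblem $LR_u$ is solved by brute-force enumeration of a finite candidate set X (standing in for the points satisfying $Dx \le e$, $x \ge 0$ and integer), the `patience` parameter for halving $\lambda$ is our assumption, and the tiny instance at the bottom is hypothetical.

```python
from itertools import product

def lagrangian_value(c, A, b, x, u):
    """Return cx + u(Ax - b) together with the residual vector Ax - b."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi
                for row, bi in zip(A, b)]
    value = sum(cj * xj for cj, xj in zip(c, x))
    value += sum(ui * ri for ui, ri in zip(u, residual))
    return value, residual

def subgradient(c, A, b, X, UB, iterations=50, patience=5):
    """Subgradient ascent on Z_D(u); returns the best bound found and final u."""
    u = [0.0] * len(b)
    lam, best, stall = 2.0, float("-inf"), 0
    for _ in range(iterations):
        # Solve LR_u by enumeration over the candidate set X.
        x = min(X, key=lambda p: lagrangian_value(c, A, b, p, u)[0])
        value, residual = lagrangian_value(c, A, b, x, u)
        if value > best + 1e-12:
            best, stall = value, 0
        else:
            stall += 1
            if stall >= patience:          # bound failed to increase: halve lambda
                lam, stall = lam / 2, 0
        norm2 = sum(r * r for r in residual)
        if norm2 == 0:                     # x satisfies Ax = b, so the bound is tight
            break
        t = lam * (UB - value) / norm2     # step size t_k
        u = [ui + t * ri for ui, ri in zip(u, residual)]
    return best, u

# Hypothetical instance: min x1 + 2*x2 + 4*x3 s.t. x1 + x2 + x3 = 2, x binary.
# Its optimal value is Z = 3; UB = 5 is the cost of the feasible point (1, 0, 1).
c, A, b = (1, 2, 4), [(1, 1, 1)], (2,)
X = list(product((0, 1), repeat=3))
bound, multipliers = subgradient(c, A, b, X, UB=5)
print(bound)
```

By Lemma 6.3.1 every value $Z_D(u^k)$ visited is a valid lower bound, so the function simply keeps the largest one seen.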

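It is now interesting to compare the Lagrangian relaxation lower bound ($Z_D$) to
the lower bound achieved by solving the linear relaxation of the set-partitioning
formulation ($Z_{LP}$).

Theorem 6.3.3

$$Z_{LP} \le Z_D.$$

Proof.

$$
\begin{aligned}
Z_D &= \max_u \left\{ \min_x \ cx + u(Ax - b) \ \middle|\ Dx \le e,\ x \ge 0 \text{ and integer} \right\} \\
&\ge \max_u \left\{ \min_x \ cx + u(Ax - b) \ \middle|\ Dx \le e,\ x \ge 0 \right\} \\
&= \max_u \max_v \left\{ ve - ub \ \middle|\ vD \le c + uA,\ v \le 0 \right\} \quad \text{(by strong duality)} \\
&= \max_{u,v} \left\{ ve - ub \ \middle|\ vD \le c + uA,\ v \le 0 \right\} \\
&= \min_y \left\{ cy \ \middle|\ Ay = b,\ Dy \le e,\ y \ge 0 \right\} \quad \text{(by strong duality)} \\
&= Z_{LP}.
\end{aligned}
$$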



We say a mathematical program P possesses the integrality property if the
solution to the linear relaxation of P always provides an integer solution. An inspection
of the above proof reveals the following corollary.

Corollary 6.3.4 If Problem $LR_u$ possesses the integrality property, then $Z_D = Z_{LP}$.



6.4 Lagrangian Relaxation and the Traveling Salesman 
Problem 

Held and Karp (1970, 1971) developed the Lagrangian relaxation technique in the 
context of the traveling salesman problem. They show some interesting relation- 
ships between this method and a graph-theoretic problem called the minimum- 
weight 1-tree problem. 






6.4.1 The 1-Tree Lower Bound



We start by defining a 1-tree. For a given choice of vertex, say vertex 1, a 1-tree is
a tree having vertex set {2, 3, ..., n} together with two distinct edges connected to
vertex 1. Therefore, a 1-tree is a graph with exactly one cycle. Define the weight of
a 1-tree to be the sum of the costs of all its edges. In the minimum-weight 1-tree
problem, the objective is to find a 1-tree of minimum weight. Such a 1-tree can be
constructed by finding a minimum spanning tree on the entire network excluding
vertex 1 and its corresponding edges and by adding to the minimum spanning tree
the two edges incident to vertex 1 of minimum cost.

We observe that any traveling salesman tour is a 1-tree in which each
vertex has degree 2. Moreover, if a minimum-weight 1-tree is a tour, then it is
an optimal traveling salesman tour. Thus, the minimum-weight 1-tree provides a
lower bound on the length of the optimal traveling salesman tour.
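
The construction just described can be sketched as follows. This is an illustrative sketch under our own conventions: vertices are numbered from 0, so index 0 plays the role of the book's vertex 1, the spanning tree is built with Prim's algorithm, and the 4-vertex distance matrix is a hypothetical example.

```python
def minimum_one_tree(dist):
    """Return (weight, edges) of a minimum-weight 1-tree rooted at vertex 0.

    dist is a symmetric matrix with at least three vertices.
    """
    n = len(dist)
    edges, weight = [], 0
    # Prim's algorithm on vertices 1..n-1, started from vertex 1.
    best = {v: (dist[1][v], 1) for v in range(2, n)}
    while best:
        v = min(best, key=lambda k: best[k][0])
        cost, parent = best.pop(v)
        edges.append((parent, v))
        weight += cost
        for w in best:
            if dist[v][w] < best[w][0]:
                best[w] = (dist[v][w], v)
    # Add the two cheapest edges incident to the special vertex 0.
    for v in sorted(range(1, n), key=lambda k: dist[0][k])[:2]:
        edges.append((0, v))
        weight += dist[0][v]
    return weight, edges

# Hypothetical symmetric distances on 4 vertices.
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
weight, edges = minimum_one_tree(dist)
print(weight, edges)
```

In this particular instance every vertex of the resulting 1-tree has degree 2, so the 1-tree is itself a tour and the bound is tight; in general the 1-tree is not a tour and its weight is only a lower bound.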

Unfortunately, this bound can be quite weak. However, there are ways to improve
it. For this purpose, consider the vector $\pi = \{\pi_1, \pi_2, \ldots, \pi_n\}$ and the following
transformation of the distances $\{d_{ij}\}$:

$$d'_{ij} = d_{ij} + \pi_i + \pi_j.$$

Let $L^*$ be the length of the optimal tour with respect to the distance matrix
$\{d_{ij}\}$. It is clear that the same tour is also optimal with respect to the distance
matrix $\{d'_{ij}\}$. To see that, observe that any traveling salesman tour $S$ of cost $L$
with respect to $\{d_{ij}\}$ has a cost $L + 2\sum_{i=1}^{n} \pi_i$ with respect to $\{d'_{ij}\}$. Thus, the
difference between the length of any traveling salesman tour in $\{d_{ij}\}$ and $\{d'_{ij}\}$ is
constant, independent of the tour.

Observe also that the above transformation of the distances does change the
minimum 1-tree. How can this idea be used? First, enumerate all possible 1-trees
and let $d_i^k$ be the degree of vertex $i$ in the $k$th 1-tree. Let $T_k$ be the weight (cost)
of that 1-tree (before transforming the distances). This implies that the cost of
that 1-tree after the transformation is exactly

$$T_k + \sum_{i \in V} \pi_i d_i^k.$$

Thus, the minimum-weight 1-tree on the transformed distance matrix is obtained
by solving

$$\min_k \left\{ T_k + \sum_{i \in V} \pi_i d_i^k \right\}.$$

Since, in the transformed distance matrix, the optimal traveling salesman tour
does not change while the 1-tree provides a lower bound, we have

$$L^* + 2\sum_{i \in V} \pi_i \ge \min_k \left\{ T_k + \sum_{i \in V} \pi_i d_i^k \right\},$$

which implies

$$L^* \ge \min_k \left\{ T_k + \sum_{i \in V} (d_i^k - 2)\pi_i \right\} = w(\pi).$$

Consequently, the best lower bound is obtained by maximizing the function $w(\pi)$
over all possible values of $\pi$. How can we find the best value of $\pi$? Held and Karp
(1970, 1971) use the subgradient method described in the previous section. That
is, starting with some arbitrary vector $\pi^0$, in step $k$ the method updates the vector
$\pi^k$ according to

$$\pi_i^{k+1} = \pi_i^k + t_k (d_i^k - 2),$$

where $\pi_i^k$ is the $i$th element in the vector $\pi^k$, and $t_k$, the step size, equals

$$t_k = \frac{\lambda_k (UB - w(\pi^k))}{\sum_{i=1}^{n} (d_i^k - 2)^2}.$$
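
The penalty update above can be sketched as a short ascent loop. The code is a hedged illustration rather than Held and Karp's original procedure: vertex 0 plays the role of vertex 1, the `patience` rule for halving $\lambda$ and the iteration limit are our assumptions, the 5-vertex instance is hypothetical, and the upper bound UB is obtained by brute-force enumeration, which is sensible only for tiny instances.

```python
from itertools import permutations

def one_tree(dist, pi):
    """Minimum 1-tree under d'_ij = d_ij + pi_i + pi_j.

    Returns its transformed weight and the degree of every vertex.
    """
    n = len(dist)

    def d(i, j):
        return dist[i][j] + pi[i] + pi[j]

    degree, weight = [0] * n, 0.0
    best = {v: (d(1, v), 1) for v in range(2, n)}   # Prim on vertices 1..n-1
    while best:
        v = min(best, key=lambda k: best[k][0])
        cost, parent = best.pop(v)
        weight += cost
        degree[v] += 1
        degree[parent] += 1
        for w in best:
            if d(v, w) < best[w][0]:
                best[w] = (d(v, w), v)
    for v in sorted(range(1, n), key=lambda k: d(0, k))[:2]:
        weight += d(0, v)
        degree[0] += 1
        degree[v] += 1
    return weight, degree

def held_karp_bound(dist, UB, iterations=100, patience=5):
    """Maximize w(pi) by subgradient ascent; returns the best bound found."""
    pi = [0.0] * len(dist)
    lam, best, stall = 2.0, float("-inf"), 0
    for _ in range(iterations):
        weight, degree = one_tree(dist, pi)
        w = weight - 2 * sum(pi)           # w(pi) for the current penalties
        if w > best + 1e-9:
            best, stall = w, 0
        else:
            stall += 1
            if stall >= patience:
                lam, stall = lam / 2, 0
        g = [di - 2 for di in degree]      # subgradient: degrees minus 2
        norm2 = sum(x * x for x in g)
        if norm2 == 0:                     # the 1-tree is a tour
            break
        t = lam * (UB - w) / norm2
        pi = [p + t * gi for p, gi in zip(pi, g)]
    return best

# Hypothetical instance in which vertex 1 is a cheap hub.
dist = [
    [0, 3, 4, 4, 4],
    [3, 0, 1, 1, 1],
    [4, 1, 0, 2, 2],
    [4, 1, 2, 0, 2],
    [4, 1, 2, 2, 0],
]
n = len(dist)
L_star = min(sum(dist[a][b] for a, b in zip((0,) + p, p + (0,)))
             for p in permutations(range(1, n)))
plain_bound, _ = one_tree(dist, [0.0] * n)
bound = held_karp_bound(dist, UB=L_star)
print(plain_bound, bound, L_star)
```

Since $w(\pi) \le L^*$ for every $\pi$, the value returned always lies between the plain 1-tree weight (obtained at $\pi = 0$) and the optimal tour length.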

6.4.2 The 1-Tree Lower Bound and Lagrangian Relaxation

We now relate the 1-tree lower bound to a Lagrangian relaxation associated with
the following formulation of the traveling salesman problem. For every $e \in E$, let
$d_e$ be the cost of the edge and let $x_e$ be a variable that takes on the value 1 if the
optimal tour includes the edge and the value 0 otherwise. Given a subset $S \subset V$,
let $E(S)$ be the set of edges from $E$ such that each edge has its two endpoints in
$S$. Let $\delta(S)$ be the collection of edges from $E$ in the cut separating $S$ from $V \setminus S$.
The traveling salesman problem can be formulated as follows:

$$
\begin{aligned}
\text{Problem } P': \quad Z^* = \min \ & \sum_{e \in E} d_e x_e \\
\text{s.t.} \quad & \sum_{e \in \delta(i)} x_e = 2, && \forall i = 1, 2, \ldots, n, && (6.7) \\
& \sum_{e \in E(S)} x_e \le |S| - 1, && \forall S \subset V \setminus \{1\},\ S \ne \emptyset, && (6.8) \\
& 0 \le x_e \le 1, && \forall e \in E, && (6.9) \\
& x_e \text{ integer}, && \forall e \in E. && (6.10)
\end{aligned}
$$

Constraints (6.7) ensure that each vertex has an edge going in and an edge
going out. Constraints (6.8), called subtour elimination constraints, forbid integral
solutions consisting of a set of disjoint cycles.

Observe that constraints (6.7) can be replaced by the following constraints:

$$
\begin{aligned}
& \sum_{e \in \delta(i)} x_e = 2, && \forall i = 1, \ldots, n-1, && (6.11) \\
& \sum_{e \in E} x_e = n. && && (6.12)
\end{aligned}
$$

This is true since constraints (6.11) are exactly constraints (6.7) for $i = 1, \ldots, n-1$.
The only missing constraint is $\sum_{e \in \delta(n)} x_e = 2$. Therefore, it is sufficient to show
that (6.12) holds if and only if this one holds. To see this, calculate

$$
\begin{aligned}
\sum_{e \in E} x_e &= \frac{1}{2} \sum_{i=1}^{n} \sum_{e \in \delta(i)} x_e \\
&= \frac{1}{2} \left[ \sum_{i=1}^{n-1} \sum_{e \in \delta(i)} x_e + \sum_{e \in \delta(n)} x_e \right] \\
&= (n-1) + \frac{1}{2} \sum_{e \in \delta(n)} x_e.
\end{aligned}
$$

Thus, $\sum_{e \in E} x_e = n$ if and only if $\sum_{e \in \delta(n)} x_e = 2$.

The resulting formulation of the traveling salesman problem is

$$\min \left\{ \sum_{e \in E} d_e x_e \ \middle|\ (6.8), (6.9), (6.10), (6.11), \text{ and } (6.12) \right\}.$$



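We can now use the Lagrangian relaxation technique described in Sect. 6.3 and
get the following lower bound on the length of the optimal tour:

$$\max_{\pi} \left\{ \min \sum_{i,j \in V} (d_{ij} + \pi_i + \pi_j) x_{ij} - 2 \sum_{i \in V} \pi_i \ \middle|\ (6.8), (6.9), (6.10), \text{ and } (6.12) \right\}.$$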



Interestingly enough, Edmonds (1971) showed that the extreme points of the
polyhedron defined by constraints (6.8)-(6.10) and (6.12) are exactly the 1-trees;
that is, the optimal solution to a linear program defined on these constraints must
be integral. Thus, we can apply Corollary 6.3.4 to see that the lower bound
obtained from the 1-tree approach is the same as the linear relaxation of Problem P'.



6.5 The Worst-Case Effectiveness of the 1-Tree Lower 
Bound 

We conclude this chapter by demonstrating that the Held and Karp (1970, 1971) 
1-tree relaxation provides a lower bound that is not far from the length of the 
optimal tour. For this purpose, we show that the Held and Karp lower bound can 
be written as follows: 

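$$
\begin{aligned}
\text{Problem HK}: \quad Z_{LP} = \min \ & \sum_{e \in E} d_e x_e \\
\text{s.t.} \quad & \sum_{e \in \delta(i)} x_e = 2, && \forall i = 1, 2, \ldots, n, && (6.13) \\
& \sum_{e \in \delta(S)} x_e \ge 2, && \forall S \subset V \setminus \{1\},\ S \ne \emptyset, && (6.14) \\
& 0 \le x_e \le 1, && \forall e \in E. && (6.15)
\end{aligned}
$$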

Lemma 6.5.1 The linear relaxation of Problem P' is equivalent to Problem HK.

Proof. We first show that any feasible solution $x$ to the linear relaxation of Problem
P' is feasible for Problem HK. Since $\sum_{e \in E(S)} x_e \le |S| - 1$, $\sum_{e \in E(V \setminus S)} x_e \le |V \setminus S| - 1$,
and $\sum_{e \in E(V)} x_e = n$ (why?), we get $\sum_{e \in \delta(S)} x_e \ge 2$.

Similarly, we show that any feasible solution $x$ to Problem HK is feasible for the
linear relaxation of Problem P'. The feasibility of $x$ in Problem HK implies that
$\sum_{i \in S} \sum_{e \in \delta(i)} x_e = 2|S|$. However,

$$\sum_{i \in S} \sum_{e \in \delta(i)} x_e = 2 \sum_{e \in E(S)} x_e + \sum_{e \in \delta(S)} x_e = 2|S|,$$

and since $\sum_{e \in \delta(S)} x_e \ge 2$, we get $\sum_{e \in E(S)} x_e \le |S| - 1$. ∎

Shmoys and Williamson (1990) have shown that the Held and Karp lower bound 
(Problem HK) has a particular monotonicity property, and as a consequence, they 
obtain a new proof of an old result from Wolsey (1980), who showed the following: 

Theorem 6.5.2 For every instance of the TSP for which the distance matrix
satisfies the triangle inequality, we have $Z^* \le \frac{3}{2} Z_{LP}$.

The proof presented here is based on the monotonicity property established 
by Shmoys and Williamson (1990) . However, we use a powerful tool discovered 
by Goemans and Bertsimas (1993), called the parsimonious property. This is a 
property that holds for a general class of network design problems. 

To present the property, consider the following linear program defined on the
complete graph $G = (V, E)$. Associated with each vertex $i \in V$ is a given number
$r_i$, which is either 0 or 2. Let $V_2 = \{i \in V \mid r_i = 2\}$.

We will analyze the following linear program (here ND stands for network
design).

$$
\begin{aligned}
\text{Problem ND}: \quad \min \ & \sum_{e \in E} d_e x_e \\
\text{s.t.} \quad & \sum_{e \in \delta(i)} x_e = r_i, && \forall i = 1, 2, \ldots, n, && (6.16) \\
& \sum_{e \in \delta(S)} x_e \ge 2, && \forall S \subset V,\ V_2 \cap S \ne \emptyset,\ V_2 \cap (V \setminus S) \ne \emptyset, && (6.17) \\
& 0 \le x_e \le 1, && \forall e \in E. && (6.18)
\end{aligned}
$$

It is easy to see that when $V_2 = V$, this linear program is equivalent to the linear
program Problem HK. We now provide a short proof of the following result.
Lemma 6.5.3 The optimal solution value to Problem ND is unchanged if we omit
constraint (6.16).

Our proof is similar to the proof presented in Bienstock and Simchi-Levi (1993);
see also Bienstock et al. (1993a), which uses a result of Lovász (1979). In his book
of problems (Exercise 6.51), Lovász presents the following result, together with a
short proof. But first, we need a definition.

Definition 6.5.4 An undirected graph $G$ is $k$-connected between two vertices $i$
and $j$ if there are $k$ (node) disjoint paths between $i$ and $j$.

Lemma 6.5.5 Let $G$ be an Eulerian multigraph and $s \in V(G)$, such that $G$ is
$k$-connected between any two vertices different from $s$. Then, for any neighbor $u$
of $s$, there exists another neighbor $w$ of $s$, such that the multigraph obtained from
$G$ by removing $\{s, u\}$ and $\{s, w\}$ and adding a new edge $\{u, w\}$ (the splitting-off
operation) is also $k$-connected between any two vertices different from $s$.

Lovász's proof of Lemma 6.5.5 can be easily modified to yield the
following.

Lemma 6.5.6 Let $G$ be an Eulerian multigraph, $Y \subseteq V(G)$ and $s \in V(G)$, such
that $G$ is $k$-connected between any two vertices of $Y$ different from $s$. Then, for
any neighbor $u$ of $s$, there exists another neighbor $w$ of $s$, such that the multigraph
obtained from $G$ by removing $\{s, u\}$ and $\{s, w\}$ and adding a new edge $\{u, w\}$ is
also $k$-connected between any two vertices of $Y$ different from $s$.

We can now prove Lemma 6.5.3. 

Proof. Let $V_0 = V \setminus V_2$; that is, $V_0 = \{i \in V \mid r_i = 0\}$. Let Problem ND' be Problem
ND without (6.16). Finally, let $x$ be a rational vector feasible for Problem ND',
chosen such that (i) $x$ is optimal for Problem ND', and (ii) subject to (i), $\sum_{e \in E} x_e$
is minimized.

Let $M$ be a positive integer, large enough so that $v = 2Mx$ is a vector of even
integers. We may regard $v$ (with a slight abuse of notation) as the incidence vector
of the edge-set $E$ of a multigraph $G$ with vertex set $V$. Clearly, $G$ is Eulerian, and
by (6.17), it is 4M-connected between any two elements of $V_2$.

Now suppose that for some vertex $s$, $\sum_{e \in \delta(\{s\})} x_e > r_s$ (i.e., $s$ has a degree
larger than $2Mr_s$ in $G$). Let us apply Lemma 6.5.6 to $s$ and any neighbor $u$ of $s$
(where $Y = V_2$), and let $H$ be the resulting multigraph, described by its incidence vector.



Clearly, 




eEE 



eEE 




e£E 



e£E 



and so 
Moreover,




Hence, by the choice of $x$, the vector $z$ obtained by dividing the incidence vector
of $H$ by $2M$ cannot be feasible for Problem ND'.

If $s \in V_0$, then by Lemma 6.5.6, $z$ is feasible for Problem ND'. Thus, we must
have $s \in V_2$ and, in fact, $\sum_{e \in \delta(\{t\})} v_e = 0$ for all $t \in V_0$. In other words, $E$ spans
precisely $V_2$, $G$ is 4M-connected, and $\sum_{e \in \delta(\{s\})} v_e \ge 4M + 2$. But we claim now that
the multigraph $H$ is 4M-connected. For by Lemma 6.5.6, it could only fail to be
4M-connected between $s$ and some other vertex, but the only possible cut of size
less than $4M$ is the one separating $s$ from $V \setminus \{s\}$. Since this cut has at least $4M$
edges, the claim is proved. Consequently, again we obtain that $z$ is feasible for
Problem ND', a contradiction. In other words, $\sum_{e \in \delta(\{i\})} v_e = 2Mr_i$ for all $i$; that is,
(6.16) holds. ∎

An immediate consequence of Lemma 6.5.3 is that in Problem HK, one can
ignore constraint (6.13) without changing the value of its optimal solution. This
new formulation reveals the following monotonicity property of the Held and Karp
lower bound: Let $A \subseteq V$ and consider the Held and Karp lower bound on the length
of the optimal traveling salesman tour through the vertices in $A$; that is,

$$
\begin{aligned}
\text{Problem HK}(A): \quad Z_{LP}(A) = \min \ & \sum_{e \in E} d_e x_e \\
\text{s.t.} \quad & \sum_{e \in \delta(S)} x_e \ge 2, && \forall S \subset A, && (6.19) \\
& 0 \le x_e \le 1, && \forall e \in E. && (6.20)
\end{aligned}
$$

Since any feasible solution to Problem HK($V$) is feasible for Problem HK($A$), the
cost of this linear program is monotone with respect to the set of nodes $A$.

We are ready to prove Theorem 6.5.2.

Proof. Section 4.3.3 presents and analyzes the heuristic developed by
Christofides for the TSP, which is based on constructing a minimum spanning
tree plus a matching on the nodes of odd degree. Observe that a similar heuristic
can be obtained if we start from a 1-tree instead of a minimum spanning tree.
Thus, the length of the optimal tour is bounded by $W(T_1^*) + W(M^*(A))$, where
$W(T_1^*)$ is the weight (cost) of the best 1-tree and $W(M^*(A))$ is the weight of
the optimal weighted matching defined on the set of odd-degree nodes in the best
1-tree, denoted by $A$.

We argue that $W(M^*(A)) \le \frac{1}{2} Z_{LP}(A)$. Let $x$ be an optimal solution to Problem
HK($A$). It is easy to see that the vector $\frac{1}{2} x$ is feasible for the following constraints:

$$
\begin{aligned}
& \sum_{e \in \delta(i)} x_e = 1, && \forall i \in A, && (6.21) \\
& \sum_{e \in E(S)} x_e \le \frac{1}{2}(|S| - 1), && \forall S \subset A,\ S \ne \emptyset,\ |S| \ge 3,\ |S| \text{ is odd}, && (6.22) \\
& 0 \le x_e \le 1, && \forall e \in E. && (6.23)
\end{aligned}
$$

A beautiful result of Edmonds (1965) tells us that these constraints are sufficient
to formulate the matching problem as a linear program. Consequently,

$$W(M^*(A)) \le \frac{1}{2} Z_{LP}(A) \le \frac{1}{2} Z_{LP}(V) = \frac{1}{2} Z_{LP},$$

and therefore

$$
\begin{aligned}
L^* &\le W(T_1^*) + W(M^*(A)) \\
&\le Z_{LP} + \frac{1}{2} Z_{LP} \\
&= \frac{3}{2} Z_{LP}.
\end{aligned}
$$
∎

6.6 Exercises

Exercise 6.1. Prove Lemma 6.3.2.

Exercise 6.2. Show that a lower bound on the cost of the optimal traveling
salesman tour can be given by



where $N$ is the set of cities and $d_{ij}$ is the distance from city $i$ to city $j$.

Exercise 6.3. Consider an instance of the bin-packing problem where there are
$m_j$ items of size $w_j \in (0, 1]$ for $j = 1, 2, \ldots, n$. Define a bin configuration to be
a vector $c = (c_1, c_2, \ldots, c_n)$ with the property that $c_i \ge 0$ for $i = 1, 2, \ldots, n$ and
$\sum_{j=1}^{n} c_j w_j \le 1$. Enumerate all possible bin configurations. Let there be $M$ such
configurations. Define $c_{jk}$ to be the number of items of size $w_j$ in bin configuration
$k$, for $k = 1, 2, \ldots, M$ and $j = 1, 2, \ldots, n$.

Formulate an integer program to solve this bin-packing problem using the
following variables: $x_k$ is the number of times configuration $k$ is used, for $k = 1, 2, \ldots, M$.

Exercise 6.4. A function $u : [0, 1] \to [0, 1]$ is dual-feasible if, for any set of
numbers $w_1, w_2, \ldots, w_k$, we have

$$\sum_{i=1}^{k} w_i \le 1 \implies \sum_{i=1}^{k} u(w_i) \le 1.$$

(a) Given an instance of the bin-packing problem with item sizes $w_1, w_2, \ldots, w_n$
and a dual-feasible function $u$, prove that $\sum_{i=1}^{n} u(w_i) \le b^*$.

( b ) Assume n is even. Let half of the items be of size | and the other half of size 

Find a dual-feasible function $u$ that satisfies

$$\sum_{i=1}^{n} u(w_i) = b^*.$$

Exercise 6.5. Consider a list $L$ of $n$ items of sizes in (|, |]. Let $b^{LP}$ be the
optimal fractional solution to the set-partitioning formulation of the bin-packing
problem, and let $b^*$ be the optimal integer solution to the same formulation. Prove
that

$$b^* \le b^{LP} + 1.$$

Exercise 6.6. Prove that if a graph has exactly 2 k vertices of odd degree, then 
the set of edges can be partitioned into k paths such that each edge is used exactly 
once. 



Part II 



Inventory Models 



7 

Economic Lot Size Models 
with Constant Demands 



7.1 Introduction 

Production planning is also an area where difficult combinatorial problems appear 
in day-to-day logistics operations. In this chapter, we analyze problems related 
to lot sizing when demands are constant and known in advance. Lot sizing in 
this deterministic setting is essentially the problem of balancing the fixed costs of 
ordering with the costs of holding inventory. In this chapter, we look at several 
different models of deterministic lot sizing. First, we consider the most basic single- 
item model, the economic lot size model. Then we look at coordinating the ordering 
of several items with a warehouse of limited capacity. Finally, we look at a one- 
warehouse multiretailer system. 

7.1.1 The Economic Lot Size Model 

The classical economic lot size model, introduced by Harris (1915) (see Erlenkotter 
1990 for an interesting historical discussion), is a framework where we can see the 
simple tradeoffs between ordering and storage costs. Consider a facility, possibly 
a warehouse or a retailer, that faces a constant demand for a single item and 
places orders for the item from another facility in the distribution network, which 
is assumed to have an unlimited quantity of the product. The model assumes the 
following. 
FIGURE 7.1. Inventory level as a function of time

• Demand is constant at a rate of D items per unit time. 

• Order quantities are fixed at Q items per order. 

• A fixed setup cost K is incurred every time the warehouse places an order. 

• A linear inventory carrying cost $h$, also referred to as the holding cost, is
accrued for every unit held in inventory per unit time.

• The lead time, that is, the time that elapses between the placement of an 
order and its receipt, is zero. 

• The initial inventory is zero. 

• The planning horizon is infinite. 

The objective is to find the optimal ordering policy minimizing the total purchasing 
and carrying cost per unit of time without shortage. 

Like all models, this is a simplified version of what might actually occur in prac- 
tice. The assumption of a known fixed demand over the infinite horizon is clearly 
unrealistic. The lead time is most likely positive, and the requirement of a fixed 
order quantity is restrictive. As we shall see, all these assumptions can be easily 
relaxed while maintaining a relatively simple optimal policy. For the purposes of 
understanding the basic tradeoffs in the model, we keep the assumptions listed 
above. 

It is easy to see that an optimal ordering policy must satisfy the zero-inventory- 
ordering property , which says that every order is received precisely when the in- 
ventory level drops to zero. This can be seen by considering the case where an 
order is placed when the inventory level is not zero. In that case, the cost is not 
increased if we simply wait until the inventory is zero to order. 

To find the optimal ordering policy in the economic lot size model, we consider
the inventory level as a function of time (see Fig. 7.1). This is the so-called
saw-toothed inventory pattern. We refer to the time between two successive
replenishments as a cycle time. Thus, the total inventory cost in a cycle of length
$T$ is

$$K + \frac{hTQ}{2},$$

and since $Q = TD$, the average total cost per unit of time is

$$\frac{KD}{Q} + \frac{hQ}{2}.$$

Hence, the optimal order quantity is

$$Q^* = \sqrt{\frac{2KD}{h}}.$$

This quantity is referred to as the economic order quantity (EOQ); it is the
quantity at which the inventory setup cost per unit of time ($\frac{KD}{Q}$) equals the inventory
holding cost per unit of time ($\frac{hQ}{2}$).

We now see how some of our assumptions can be relaxed, without losing any of
the model’s simplicity. Consider the case in which the initial inventory is positive,
say at level $I_0$; then the first order for $Q^*$ items is simply delayed until time $I_0/D$.
Further, the assumption of zero lead time can also be easily relaxed. In fact, the
model can handle any deterministic lead time $L$. To do this, simply place an order
for $Q^*$ items when the inventory level is $DL$. On the other hand, relaxing the
assumptions of fixed demands and infinite planning horizon requires significant
changes to the above solution.
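
As a quick numerical check of the EOQ formula, the following sketch evaluates the average cost $KD/Q + hQ/2$ and the optimal order quantity. The parameter values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
from math import sqrt

def average_cost(Q, K, D, h):
    """Average cost per unit time: setup cost KD/Q plus holding cost hQ/2."""
    return K * D / Q + h * Q / 2

def eoq(K, D, h):
    """Economic order quantity Q* = sqrt(2KD/h)."""
    return sqrt(2 * K * D / h)

# Hypothetical data: setup cost 100, demand 50 per unit time, holding cost 4.
K, D, h = 100, 50, 4
Q_star = eoq(K, D, h)
setup_term = K * D / Q_star      # setup cost per unit time at Q*
holding_term = h * Q_star / 2    # holding cost per unit time at Q*
print(Q_star, setup_term, holding_term, average_cost(Q_star, K, D, h))
```

At $Q^*$ the two cost terms coincide, which is the balance property stated above; moving away from $Q^*$ in either direction raises the average cost.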



7.1.2 The Finite-Horizon Model



To make the model more realistic, we now introduce a finite horizon, say $t$. For
instance, in the retail apparel industry, such a horizon may represent an 8- to 12-week
period, for example, the “winter season,” in which demand for the product might
be assumed to be constant and known. We also relax the assumption that the
order quantities are fixed. We seek an inventory policy on the interval $[0, t]$ that
minimizes the ordering and carrying costs.

For this purpose, consider any inventory policy, say, $\mathcal{P}$, that places $m \ge 1$ orders
in the interval $[0, t]$. Clearly, the first order must be placed at time zero and the last
must be placed so that the inventory at time $t$ is zero. For any $i$, $1 \le i \le m-1$, let
$T_i$ be the time between the placement of the $i$th order and the $(i+1)$st order and let
$T_m$ be the time between the placement of the last order and $t$. Thus, by definition,
$t = \sum_{i=1}^{m} T_i$ and $\mathcal{P}$ places the $j$th order at time $\sum_{i=1}^{j-1} T_i$ for $1 \le j \le m$. Again,
it is clear that the policy $\mathcal{P}$ must satisfy the zero-inventory-ordering property.
Figure 7.2 illustrates the inventory level of policy $\mathcal{P}$.

For policy $\mathcal{P}$, let $I(\tau)$ be the inventory level at time $\tau \in [0, t]$. Thus, the total
cost per unit of time associated with $\mathcal{P}$ is

$$\frac{1}{t} \left[ Km + h \int_0^t I(\tau)\, d\tau \right].$$

FIGURE 7.2. Inventory level as a function of time under policy $\mathcal{P}$



The only thing we know about the function $I(\tau)$ is that it decreases at a rate of $D$
(a slope of $-D$) between orders and reaches zero exactly $m$ times. Thus, we can
express the total inventory up to time $t$ as a function of the time between orders
$\{T_i\}_{i=1}^{m}$ as follows:

$$\int_0^t I(\tau)\, d\tau = \sum_{i=1}^{m} \frac{T_i \cdot DT_i}{2} = \frac{D}{2} \sum_{i=1}^{m} T_i^2.$$

Consequently, if $m$ orders are placed, we can find the best times to place them by
solving

$$\min \left\{ \sum_{i=1}^{m} T_i^2 \ \middle|\ \sum_{i=1}^{m} T_i = t,\ T_i \ge 0,\ \forall i = 1, 2, \ldots, m \right\}.$$

The optimal solution to this convex optimization problem is $T_i = \frac{t}{m}$ for each
$i = 1, 2, \ldots, m$. Hence, an optimal policy must have the following property.



Property 7.1.1 For a problem with one product over the interval $[0, t]$, the
inventory policy with minimum cost that places $m$ orders is achieved by placing orders
of equal size at equally spaced points in time.



The property thus implies that the total purchasing and carrying cost per unit
time associated with $\mathcal{P}$ is at least

$$\frac{Km}{t} + \frac{hDt}{2m}.$$

Consequently, by selecting the value of $m$ that minimizes this value, we can
construct a policy of minimal cost. Let

$$\alpha = t \sqrt{\frac{hD}{2K}},$$

and thus the best value of $m$ is either $\lfloor \alpha \rfloor$ or $\lceil \alpha \rceil$, depending on which yields the
smaller cost. Thus, our policy in the finite-horizon case is, in fact, very similar to
the infinite-horizon case. Orders are placed at regularly spaced intervals of time,
and, of course, the orders are of the same size each time.
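
The choice between $\lfloor \alpha \rfloor$ and $\lceil \alpha \rceil$ can be sketched as follows. The data are hypothetical, and the guard that forces at least one order (for the case $\alpha < 1$) is our own addition, not part of the text.

```python
from math import ceil, floor, sqrt

def cost_per_unit_time(m, K, h, D, t):
    """Cost of m equally spaced, equally sized orders: Km/t + hDt/(2m)."""
    return K * m / t + h * D * t / (2 * m)

def best_number_of_orders(K, h, D, t):
    """Compare floor(alpha) and ceil(alpha), where alpha = t*sqrt(hD/(2K))."""
    alpha = t * sqrt(h * D / (2 * K))
    # Assumption: at least one order is needed, hence the max(1, ...) guard.
    candidates = {max(1, floor(alpha)), max(1, ceil(alpha))}
    return min(candidates, key=lambda m: cost_per_unit_time(m, K, h, D, t))

# Hypothetical data: K = 100, h = 4, D = 50, horizon t = 2.5, so alpha = 2.5.
K, h, D, t = 100, 4, 50, 2.5
m_best = best_number_of_orders(K, h, D, t)
order_size = D * t / m_best      # each order covers t/m_best units of time
print(m_best, order_size, cost_per_unit_time(m_best, K, h, D, t))
```

Because the cost is convex in $m$, checking the two integers adjacent to $\alpha$ is enough; here three orders cost slightly less per unit time than two.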
7.1.3 Power-of-Two Policies

Consider the infinite-horizon model described in Sect. 7.1. For this model, we know
that the average total cost per unit of time is

$$\frac{KD}{Q} + \frac{hQ}{2} = \frac{K}{T} + \frac{hTD}{2},$$

where $T$ is the time between orders. In this subsection, following Muckstadt and
Roundy (1993), we introduce a new class of policies called power-of-two policies.

To simplify the analysis, and in accordance with the notation used in the
literature (see Roundy 1985 and Muckstadt and Roundy 1993), let $g = \frac{hD}{2}$ and
hence

$$f(T) = \frac{K}{T} + gT.$$

Observe that the function $f(T)$ motivates another interpretation of the model.
We can consider the problem to be an economic lot size model with unit demand
rate, that is, $D = 1$, and inventory holding cost $2g$. The optimal reorder interval
is $T^* = \sqrt{K/g}$, and the total cost per unit time is $f(T^*) = 2\sqrt{Kg}$.

One difficulty with the economic lot size model is that the optimal reorder 
interval T* may take on any value and thus might lead to highly impractical 
optimal policies. For instance, reorder intervals of days, or y/n weeks, would 
not be easy to implement. That is, the model might specify that orders be placed on 
Monday of one week, Thursday of the next, Tuesday of the next week, and so forth, 
a schedule of orders that may not have an easily recognizable pattern. Therefore, 
it is natural to consider policies where the reorder interval $T$ is restricted to values
that would entail easily implementable policies. One such restriction is termed the
power-of-two restriction. In this case, $T$ is restricted to be a power-of-two multiple
of some fixed base planning period $T_B$; that is,

$$T = T_B 2^k, \quad k \in \{0, 1, 2, 3, \ldots\}. \tag{7.1}$$



Such a policy is called a power-of-two policy. The base planning period $T_B$ may
represent a day, week, or month, for example, and is usually fixed beforehand. It
represents the minimum possible reorder interval.

Restricting ourselves to power-of-two policies requires addressing the following 
issues. 

• How does one find the best power-of-two policy, the one minimizing the cost 
over all possible power-of-two policies? 

• How far from optimal is the best policy of this type? 

We start by answering the first question. Let $T^* = \sqrt{K/g}$ be the optimal
(unrestricted) reorder interval, and let $T$ be the optimal power-of-two reorder interval.
Since $f$ is convex, the optimal $k$ in (7.1) is the smallest integer $k$ satisfying

$$f(T_B 2^k) \le f(T_B 2^{k+1}),$$

or, equivalently,

$$\frac{K}{T_B 2^k} + g T_B 2^k \le \frac{K}{T_B 2^{k+1}} + g T_B 2^{k+1}.$$

Hence, $k$ is the smallest integer such that

$$\sqrt{\frac{K}{2g}} = \frac{1}{\sqrt{2}} T^* \le T_B 2^k.$$

Thus, finding the optimal power-of-two policy is straightforward. 
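
The rule above translates directly into code. The following is a minimal sketch, assuming $f(T) = K/T + gT$ and the restriction $k \ge 0$ from (7.1); the function names `f` and `best_power_of_two` are chosen for illustration only.

```python
import math


def f(T, K, g):
    """Average cost per unit time for reorder interval T."""
    return K / T + g * T


def best_power_of_two(K, g, T_B):
    """Smallest integer k >= 0 with T_B * 2**k >= T*/sqrt(2), where T* = sqrt(K/g)."""
    threshold = math.sqrt(K / (2 * g))  # equals T*/sqrt(2)
    k = 0
    while T_B * 2 ** k < threshold:
        k += 1
    return k
```

With $K = 50$, $g = 2$, and $T_B = 1$, the unrestricted optimum is $T^* = 5$, and the sketch returns $k = 2$, that is, $T = 4$. The cost ratio $f(4)/f(5) = 20.5/20 = 1.025$ is well inside the 6% guarantee, which applies whenever $T_B \le \sqrt{2}\,T^*$.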

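Observe that by the definition of the optimal $k$, it must also be true that

$$T_B 2^{k-1} \le \frac{1}{\sqrt{2}} T^*,$$

and hence the optimal power-of-two policy, for a given base planning period $T_B$,
must be in the interval $[\frac{1}{\sqrt{2}} T^*, \sqrt{2}\, T^*]$. It is easy to verify that

$$f\left(\tfrac{1}{\sqrt{2}} T^*\right) = f\left(\sqrt{2}\, T^*\right) = \frac{1}{2}\left(\frac{1}{\sqrt{2}} + \sqrt{2}\right) f(T^*),$$

and hence, since $f$ is convex, we have

$$\frac{f(T)}{f(T^*)} \le \frac{1}{2}\left(\frac{1}{\sqrt{2}} + \sqrt{2}\right) \approx 1.06.$$
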

Consequently, the average inventory purchasing and carrying cost of the best
power-of-two policy is guaranteed to be within 6% of the average cost of the
overall minimum policy. The reader can see that this property is a result of the
“flatness” of the function $f$ around its minimum.

This restriction, to power-of-two multiples of the base planning period, will also 
prove to be quite useful later in a more general setting. 

7.2 Multi-Item Inventory Models 

7.2.1 Introduction

The previous models established optimal inventory policies for single-item models.
It is simple to show that without the presence of joint order costs, a problem 
with several items each facing a constant demand can be handled by solving each 
item’s replenishment problem separately. In reality, the management of a single 
warehouse inventory system involves coordinating inventory orders to minimize 
cost without exceeding the warehouse capacity. The warehouse capacity limits the 
total volume held by the warehouse at any point in time. This constraint ties 
together the different items and necessitates careful coordination (or scheduling) 
of the orders. That is, it is important to know not only how often an item is
ordered, but also exactly the point in time at which each order takes place. This
problem is called the economic warehouse lot scheduling problem (EWLSP). The 
scheduling part, hereafter called the staggering problem, is exactly the problem 
of time-phasing the placement of the orders to satisfy the warehouse capacity 
constraint. Unfortunately, this problem has no easy solution, and consequently it 
has attracted a considerable amount of attention in the last three decades. 

The earliest known reference to the problem appears in Churchman et al. (1957) 
and subsequently in Holt (1958) and Hadley and Whitin (1963). These authors 
were concerned with determining lot sizes that made an overall schedule satisfy 
the capacity constraint, and not with the possibility of phasing the orders to avoid 
holding the maximum volume of each item at the same time. Thus, they only 
considered what are called independent solutions, wherein every item is replenished 
without any regard for coordination with other items. 

Several authors considered another class of policies called rotation cycle policies 
wherein all items share the same order interval. Homer (1966) showed how to 
optimally time-phase (stagger) the orders to satisfy the warehouse constraint for 
a given common order interval. Page and Paul (1976), Zoller (1977), and Hall 
(1988) independently rediscovered Homer’s result. At the end of his paper devoted 
to rotation-cycle policies, Zoller indicates the possibility of partitioning the items 
into disjoint subsets, or clusters, if the assumption of a rotation policy “proves 
to be too restrictive.” This is precisely Page and Paul’s partitioning heuristic. 
In their heuristic, all the items in a cluster share a common order interval. The 
orders are then optimally staggered within each cluster, but no attempt is made to 
time-phase the orders of different clusters. Goyal (1978) argued that such a time- 
phasing across the different clusters may lead to a further reduction in warehouse 
space requirements. Hartley and Thomas (1982) and Thomas and Hartley (1983) 
considered the two-item case in detail. 

A number of studies have been concerned with the strategic version of the 
EWLSP in which the warehouse capacity is not a constraint but rather a deci- 
sion variable. These include Hodgson and Howe (1982), Park and Yun (1985), 
Hall (1988), Rosenblatt and Rothblum (1990), and Anily (1991). In this model, 
the inventory carrying cost consists of two parts; one part is proportional to the 
average inventory, while the second part is proportional to the peak inventory. A 
component of the latter cost, discussed in Silver and Peterson (1985), is the cost 
of leasing the storage space. This cost is typically proportional to the size of the 
warehouse, and not to the inventories actually stored in it. 

Define a policy to be a stationary order size policy if all replenishments of an item 
are of the same size. Likewise, a stationary order intervals policy has all orders for 
an item equally spaced in time. It is easily verified that an optimal stationary order 
size (respectively, stationary order interval) policy is also a stationary order interval 
(respectively, a stationary order size) policy if every order of an item is received 
precisely when the inventory of that item drops to zero; that is, it also satisfies 
the zero-inventory-ordering property. Thus, it is natural to consider policies that 
have all three properties: stationary order size, stationary order interval, and zero 
inventory ordering. We call such policies stationary order size and interval policies,
in short, SOSI policies. Two “extreme” cases of SOSI policies are the independent
solutions and the rotation-cycle policies defined above. All the authors cited above 
considered SOSI policies exclusively. Zoller claims that SOSI policies are the only 
rational alternative, and most authors agree that SOSI policies are much easier 
to implement in practice. In his Ph.D. thesis, however, Hariga (1988) investigated
both time-variant and stationary order sizes. He was motivated to study
time-variant order sizes by their successful application in resolving the feasibility issue
in the economic lot scheduling problem (ELSP) (see Dobson 1987). 

Anily’s paper departs from earlier work on the EWLSP in its focus on the
worst-case performance of heuristics. In her paper, Anily restricts herself to the
class of SOSI policies for the strategic model. She proves lower bounds on the
minimum-required warehouse size and on the total cost for this class of policies.
She presents a partitioning heuristic of which the best independent solution and
the best rotation-cycle policies are special cases. This partitioning heuristic is
similar to the one proposed by Page and Paul for the tactical model, although the
precise methods for finding the partition are different. Anily proves that the ratio
of the cost of the best independent solution to her lower bound is at most $\sqrt{2}$. She
also provides a data-dependent bound for the best rotation cycle, derived from
Jones and Inman’s (1989) work on the economic lot size problem. As a result,
her partitioning heuristic is at least as good as either special case and thus has a
worst-case bound of $\sqrt{2}$ relative to SOSI policies.

In this section, we determine easily computable lower bounds on the cost of the 
EWLSP as well as some simple heuristics for the problem. These bounds are used 
to determine the worst-case performance of these heuristics on different versions 
of the problem. First, in Sect. 7.2.2, we introduce notation, state assumptions, and 
formally define the strategic and tactical versions of the EWLSP. In Sect. 7.2.3, 
we establish the worst-case results. The discussion in this section is based on the 
work of Gallego et al. (1996). 

7.2.2 Notation and Assumptions

Let $N = \{1, 2, \ldots, n\}$ be a set of $n$ items each facing a constant unit demand rate
(this normalization is without loss of generality). An ordering cost $K_i$ is incurred each
time an order for item $i$ is placed. A linear holding cost $2h_i$ is accrued for each
unit of item $i$ held in inventory per unit of time. Demand for each item must be
met over an infinite horizon without shortages or backlogging.

The volume of inventory of item $i$ held at a given point in time is the product
of its inventory level at that time and the volume usage rate of item $i$, denoted by
$\gamma_i > 0$. The volume usage rate is defined as the volume displaced by one unit of
item $i$. Without loss of generality, we select the unit of volume so that $\sum_{i=1}^{n} \gamma_i = 1$.

The objective in the strategic version of the EWLSP is to minimize the long-run
average inventory carrying and ordering cost plus a cost proportional to the
maximum volume held by the warehouse at any point in time. Formally, for any
inventory policy $\mathcal{P}$, let $V(\mathcal{P})$ denote the maximum inventory volume held by the
warehouse, and let $C(\mathcal{P})$ be the long-run average inventory ordering and holding
cost incurred by this policy. Then the objective is to find a policy $\mathcal{P}$ minimizing

$$Z(\mathcal{P}) \triangleq C(\mathcal{P}) + V(\mathcal{P}).$$

The tactical version of the EWLSP has also received much attention in the
literature. There, the objective is to find a policy $\mathcal{P}$ minimizing the long-run average
inventory ordering and holding costs subject to the inventory volume never
exceeding the warehouse capacity. Hence, the tactical version can be formulated as
follows: Find a policy $\mathcal{P}$ minimizing $C(\mathcal{P})$ subject to $V(\mathcal{P}) \le v$, where $v$ denotes the
available warehouse volume.



7.2.3 Worst-Case Analyses

Preliminaries 

We present here two simple results that are used in subsequent analyses. 

Given a SOSI policy, let $T = \{T_1, T_2, \ldots, T_n\}$ be the vector of reorder intervals,
where $T_i$ is the reorder interval of item $i$. For any such vector $T$, let $V(T)$ denote
the maximum volume of inventory held by the warehouse over all points in time.
The following provides a simple upper bound on $V(T)$.

Lemma 7.2.1 For any vector $T = \{T_1, T_2, \ldots, T_n\}$, we have

$$V(T) \le \sum_{i=1}^{n} \gamma_i T_i.$$

Proof. Clearly, the inventory level of item $i$, at any moment in time, is no more
than $T_i$ (recall demand is 1 for all $i$), so its volume is at most $\gamma_i T_i$. Summing over
the items gives the bound. ∎

For the next result, we need some additional notation. Consider any inventory
policy $\mathcal{P}$ and any time interval $[0, t]$. Let $V(\mathcal{P}, t)$ be the maximum inventory volume
held by the warehouse in policy $\mathcal{P}$ over the interval $[0, t]$, and let $C(\mathcal{P}, t)$ be the average
inventory ordering and holding cost incurred over $[0, t]$. Let $m_i$ be the number of
times the warehouse places an order for item $i$ over the interval $[0, t]$. For $\tau \in [0, t]$,
let $I_i(\tau)$ be the inventory level of item $i$ at time $\tau$. Let $v_i(\tau)$ be the volume of
inventory held by item $i$ at time $\tau$; that is, $v_i(\tau) = \gamma_i I_i(\tau)$. Also, let
$v(\tau) = \sum_{i=1}^{n} v_i(\tau)$ be the volume of inventory held by the warehouse at time $\tau$.

Lemma 7.2.2 For any inventory policy $\mathcal{P}$ and time interval $[0, t]$, we have

$$\frac{1}{2} \sum_{i=1}^{n} \gamma_i \frac{t}{m_i} \le \frac{1}{t} \sum_{i=1}^{n} \gamma_i \int_{\tau=0}^{t} I_i(\tau)\, d\tau \le V(\mathcal{P}, t).$$

Proof. Clearly, $v(\tau) \le V(\mathcal{P}, t)$ for all $\tau \le t$. Taking the integral up to time $t > 0$
gives

$$
\begin{aligned}
V(\mathcal{P}, t) &\ge \frac{1}{t} \int_{\tau=0}^{t} \sum_{i=1}^{n} v_i(\tau)\, d\tau \\
&= \frac{1}{t} \int_{\tau=0}^{t} \sum_{i=1}^{n} \gamma_i I_i(\tau)\, d\tau \\
&= \sum_{i=1}^{n} \gamma_i \frac{1}{t} \int_{\tau=0}^{t} I_i(\tau)\, d\tau \\
&\ge \frac{1}{2} \sum_{i=1}^{n} \gamma_i \frac{t}{m_i},
\end{aligned}
$$

where the last inequality follows from Property 7.1.1, which states that when $m_i$
orders for a single item are placed over the interval $[0, t]$, the average inventory
level is minimized by placing equal orders at equally spaced points in time. ∎



The Strategic Model 

Consider the following heuristic for the strategic version of the EWLSP. Use the
vector of reorder intervals $T$ that solves

$$Z^H = \min_{T \ge 0} \left\{ \sum_{i=1}^{n} \left( \frac{K_i}{T_i} + h_i T_i \right) + \sum_{i=1}^{n} \gamma_i T_i \right\}.$$

Clearly, the vector $T$ can be found in $O(n)$ time by solving $n$ separate economic
lot size models, and

$$Z^H = 2 \sum_{i=1}^{n} \sqrt{K_i (h_i + \gamma_i)}. \tag{7.2}$$

By Lemma 7.2.1, $Z^H$ must provide an upper bound on the optimal solution value
of the strategic model.

We now construct a lower bound on the optimal solution value over all possible 
inventory policies. The lower bound is the cost of the optimal policy if the ware- 
house cost were based on average inventory rather than maximum inventory. This 
bound will be used to prove the worst-case result. 

Lemma 7.2.3 A lower bound on the optimal solution value over all possible
inventory strategies is given by

$$Z^{LB} = 2 \sum_{i=1}^{n} \sqrt{K_i (h_i + \gamma_i / 2)}. \tag{7.3}$$



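Proof. We show that $Z^{LB} \le C(\mathcal{P}, t) + V(\mathcal{P}, t)$ for all possible inventory policies
$\mathcal{P}$ and for all $t > 0$. Given an inventory policy $\mathcal{P}$, where $m_i$ orders for item $i$ are
placed over a time interval $[0, t]$, then

$$C(\mathcal{P}, t) = \frac{1}{t} \sum_{i=1}^{n} \left( m_i K_i + 2 h_i \int_{\tau=0}^{t} I_i(\tau)\, d\tau \right).$$

Combining this cost with the lower bound obtained in Lemma 7.2.2 on $V(\mathcal{P}, t)$
yields the following lower bound on $C(\mathcal{P}, t) + V(\mathcal{P}, t)$:

$$
\begin{aligned}
C(\mathcal{P}, t) + V(\mathcal{P}, t) &\ge \frac{1}{t} \sum_{i=1}^{n} \left[ m_i K_i + 2 h_i \int_{\tau=0}^{t} I_i(\tau)\, d\tau \right] + \frac{1}{t} \sum_{i=1}^{n} \gamma_i \int_{\tau=0}^{t} I_i(\tau)\, d\tau \\
&= \frac{1}{t} \sum_{i=1}^{n} \left[ m_i K_i + (2 h_i + \gamma_i) \int_{\tau=0}^{t} I_i(\tau)\, d\tau \right] \\
&\ge \sum_{i=1}^{n} \left[ K_i \left( \frac{m_i}{t} \right) + \left( h_i + \frac{\gamma_i}{2} \right) \left( \frac{t}{m_i} \right) \right].
\end{aligned}
$$

The last inequality again follows from Property 7.1.1. Minimizing the last
expression with respect to $\frac{t}{m_i}$ for each $i \in N$ proves the result. ∎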

We now show that this heuristic is effective in terms of worst-case performance. 

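Theorem 7.2.4

$$\frac{Z^H}{Z^{LB}} \le \sqrt{2}.$$

Proof. Combining (7.2) and (7.3), we get

$$\frac{Z^H}{Z^{LB}} = \frac{2 \sum_i \sqrt{K_i (h_i + \gamma_i)}}{2 \sum_i \sqrt{K_i (h_i + \gamma_i / 2)}} \le \sqrt{2},$$

where the inequality holds term by term because $h_i + \gamma_i \le 2 (h_i + \gamma_i / 2)$. ∎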
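
The heuristic value (7.2) and the lower bound (7.3) are both closed-form, so they are easy to compute side by side. The sketch below assumes unit demand rates and the cost structure of this section; the function name `strategic_bounds` and the list-based inputs are illustrative choices.

```python
import math


def strategic_bounds(K, h, gamma):
    """Return (T, Z_H, Z_LB) for the strategic EWLSP heuristic.

    K, h, gamma are per-item lists: ordering cost K_i, holding parameter h_i
    (the holding cost rate is 2*h_i), and volume usage rate gamma_i.
    """
    items = list(zip(K, h, gamma))
    # Each item solves its own lot size model with holding parameter h_i + gamma_i.
    T = [math.sqrt(k / (hi + gi)) for k, hi, gi in items]
    Z_H = 2 * sum(math.sqrt(k * (hi + gi)) for k, hi, gi in items)
    Z_LB = 2 * sum(math.sqrt(k * (hi + gi / 2)) for k, hi, gi in items)
    return T, Z_H, Z_LB
```

For any input with positive $K_i$ and $\gamma_i$, the returned values satisfy $Z^{LB} \le Z^H \le \sqrt{2}\, Z^{LB}$, in line with Theorem 7.2.4.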



Can this bound be improved? The following example shows that the bound is
tight as the number of items grows to infinity. Consider an example with $n$ items with
$K_i = K$, $h_i = 0$, and $\gamma_i = \gamma = \frac{1}{n}$ for all $i \in N$. Clearly,

$$Z^H = 2n\sqrt{K\gamma}.$$

We now construct a feasible solution whose cost approaches the lower bound $Z^{LB}$
as $n$ goes to infinity. Consider a feasible policy $\mathcal{P}$ with identical reorder intervals
denoted by $T$. To reduce the maximum volume $V(T)$, we stagger the orders such
that item $i$ is ordered at times $T\left[\frac{i-1}{n} + k\right]$ for $k \ge 0$. Then the maximum volume
of inventory is $\frac{n+1}{2} T \gamma$. Hence, the cost of policy $\mathcal{P}$ is

$$Z(\mathcal{P}) = \frac{nK}{T} + \frac{n+1}{2} T \gamma.$$

Minimizing with respect to $T$ gives

$$Z(\mathcal{P}) = \sqrt{2n(n+1)K\gamma}.$$

Consequently,

$$\frac{Z^H}{Z^{LB}} \ge \frac{Z^H}{Z(\mathcal{P})} = \frac{2n\sqrt{K\gamma}}{\sqrt{2n(n+1)K\gamma}}.$$

The limit of this last quantity is $\sqrt{2}$ (as $n$ goes to infinity); hence, along with
Theorem 7.2.4, we see that an example can be constructed where the worst-case
ratio is arbitrarily close to $\sqrt{2}$.
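
The tightness example can be checked numerically. The sketch below, with hypothetical function names, evaluates the peak volume of the staggered rotation policy directly from the sawtooth inventory levels (rather than using the closed form) and then computes the ratio $Z^H / Z(\mathcal{P})$ for identical items with $h_i = 0$ and $\gamma_i = 1/n$.

```python
import math


def staggered_peak_volume(n, T, gamma):
    """Peak total volume when item i (0-based) is ordered at times T*(i/n + k).

    Each item has unit demand, so its inventory is T right after an order and
    falls linearly to 0. Total volume only jumps up at order epochs, so it is
    enough to inspect the n order epochs of one cycle.
    """
    peak = 0.0
    for j in range(n):
        tau = T * j / n  # epoch at which item j is ordered
        total = 0.0
        for i in range(n):
            age = (tau - T * i / n) % T  # time since item i's last order
            total += gamma * (T - age)
        peak = max(peak, total)
    return peak


def strategic_ratio(n, K=1.0):
    """Ratio Z_H / Z(P) for n identical items with h_i = 0 and gamma_i = 1/n."""
    gamma = 1.0 / n
    Z_H = 2 * n * math.sqrt(K * gamma)
    # T minimizing n*K/T + ((n+1)/2) * gamma * T
    T = math.sqrt(n * K / ((n + 1) * gamma / 2))
    Z_P = n * K / T + staggered_peak_volume(n, T, gamma)
    return Z_H / Z_P
```

The ratio equals $\sqrt{2n/(n+1)}$, so it increases toward $\sqrt{2}$ as $n$ grows, as the text states.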



The Tactical Model 

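For the tactical version of the EWLSP, a simple heuristic, denoted HW and first
proposed by Hadley and Whitin (1963), is to solve

$$
\begin{aligned}
\text{Problem } P^{HW}: \quad C^{HW} = \min \ & \sum_{i=1}^{n} \left( h_i T_i + \frac{K_i}{T_i} \right) \\
\text{s.t.} \ & \sum_{i=1}^{n} \gamma_i T_i \le v, \\
& T \ge 0.
\end{aligned}
$$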

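We show that the HW heuristic has a worst-case performance bound of 2 with
respect to all feasible policies. We do so by proving that the solution to the following
nonlinear program provides a lower bound on the cost of any feasible policy.

$$
\begin{aligned}
\text{Problem } P^{LB}: \quad C^{LB} = \min \ & \sum_{i=1}^{n} \left( h_i T_i + \frac{K_i}{T_i} \right) \\
\text{s.t.} \ & \frac{1}{2} \sum_{i=1}^{n} \gamma_i T_i \le v, \qquad (7.4) \\
& T \ge 0.
\end{aligned}
$$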

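Lemma 7.2.5 $C^{LB}$ is a lower bound on the cost of any feasible inventory policy.

Proof. Consider any feasible policy $\mathcal{P}$ over the interval $[0, t]$ that places $m_i$ orders
for item $i$ in $[0, t]$. From Lemma 7.2.2, we have, for all $t > 0$,

$$v \ge V(\mathcal{P}, t) \ge \frac{1}{2} \sum_{i=1}^{n} \gamma_i \frac{t}{m_i}.$$

The average inventory ordering and holding cost incurred over the interval $[0, t]$ is

$$C(\mathcal{P}, t) = \frac{1}{t} \sum_{i=1}^{n} \left[ m_i K_i + 2 h_i \int_{\tau=0}^{t} I_i(\tau)\, d\tau \right] \ge \sum_{i=1}^{n} \left[ K_i \frac{m_i}{t} + h_i \frac{t}{m_i} \right]. \tag{7.5}$$

Again, the last inequality follows from Property 7.1.1.

Thus, by replacing $\frac{t}{m_i}$ with $T_i$ for all $i \ge 1$, we see that minimizing (7.5) subject
to $\frac{1}{2} \sum_{i} \gamma_i t / m_i \le v$ provides a lower bound on $C(\mathcal{P}, t)$. ∎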

We now prove the worst-case bound. 

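Theorem 7.2.6

$$\frac{C^{HW}}{C^{LB}} \le 2.$$

Proof. Let $T^{LB} = \{T_1^{LB}, T_2^{LB}, \ldots, T_n^{LB}\}$ be the optimal solution to $P^{LB}$.
Obviously, $T'_i = \frac{1}{2} T_i^{LB}$ is feasible for $P^{HW}$. Hence,

$$C^{HW} \le \sum_{i=1}^{n} \left[ \frac{1}{2} h_i T_i^{LB} + 2 \frac{K_i}{T_i^{LB}} \right] \le 2 C^{LB}.$$ ∎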
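
Both $P^{HW}$ and $P^{LB}$ are separable convex programs with a single linear constraint, so one simple way to solve them numerically is bisection on the Lagrange multiplier of the volume constraint. This is a sketch of that standard technique, not a method prescribed by the text; the function name `min_cost_under_volume_cap` is hypothetical, and the code assumes $K_i > 0$ and $\gamma_i > 0$. Since the constraint (7.4) is equivalent to $\sum_i \gamma_i T_i \le 2v$, the same routine gives $C^{LB}$ when called with capacity $2v$.

```python
import math


def min_cost_under_volume_cap(K, h, gamma, cap):
    """Minimize sum(h_i*T_i + K_i/T_i) subject to sum(gamma_i*T_i) <= cap.

    Uses T_i(lam) = sqrt(K_i / (h_i + lam*gamma_i)) and bisects on lam >= 0.
    Returns (T, cost).
    """
    def intervals(lam):
        return [math.sqrt(k / (hi + lam * gi)) for k, hi, gi in zip(K, h, gamma)]

    def volume(lam):
        return sum(gi * Ti for gi, Ti in zip(gamma, intervals(lam)))

    if all(hi > 0 for hi in h) and volume(0.0) <= cap:
        lam = 0.0  # the unconstrained lot sizes already fit
    else:
        lo, hi_lam = 0.0, 1.0
        while volume(hi_lam) > cap:  # find a multiplier large enough to be feasible
            hi_lam *= 2.0
        for _ in range(200):  # bisection; volume(lam) is decreasing in lam
            mid = 0.5 * (lo + hi_lam)
            if volume(mid) > cap:
                lo = mid
            else:
                hi_lam = mid
        lam = hi_lam  # feasible end of the bracket
    T = intervals(lam)
    cost = sum(hi * Ti + k / Ti for k, hi, Ti in zip(K, h, T))
    return T, cost
```

Calling the routine with `cap = v` and with `cap = 2 * v` yields $C^{HW}$ and $C^{LB}$, respectively, and the two values can be compared against the bound of Theorem 7.2.6.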



As in the strategic version, the worst-case bound provided by the above theorem
can be shown to be tight. To do so, consider the case where all items are identical
with $K_i = K$, $h_i = 0$, and $\gamma_i = \gamma = \frac{1}{n}$ for all $i \in N$. The solution to problem $P^{HW}$
is clearly $T_i = v$ for all $i \in N$, so $C^{HW} = \frac{nK}{v}$. Consider now a feasible policy
$\mathcal{P}$ with identical reorder intervals denoted by $T$ such that an order for item $i$ is
placed at times $T\left[\frac{i-1}{n} + k\right]$ for $k \ge 0$. The maximum volume occupied by policy
$\mathcal{P}$ is $(n+1)\frac{T}{2}\gamma$. So $T = \frac{2nv}{n+1}$ is feasible and $C(\mathcal{P}) = \frac{K(n+1)}{2v}$. Hence,

$$\lim_{n \to \infty} \frac{C^{HW}}{C(\mathcal{P})} = \lim_{n \to \infty} \frac{nK/v}{K(n+1)/2v} = 2.$$

By performing a similar analysis, one can obtain worst-case bounds on the
performance of heuristics for other versions of the EWLSP. For instance, for the joint
replenishment version of the strategic model, where an additional setup cost $K_0$ is
incurred whenever an order for one or more items is placed, the worst-case bound
of a heuristic, similar to the one described for the EWLSP, can be shown to be
$\sqrt{3}$. The worst-case bound on the tactical version of the joint replenishment model
can be shown to be $2\sqrt{2}$.



7.3 A Single-Warehouse Multiretailer Model

7.3.1 Introduction

Many distribution systems involve replenishing the inventories of geographically 
dispersed retailers. Consider a distribution system in which a single warehouse
supplies a set of retailers with a single product. Each retailer faces a constant
retailer-specific demand that must be met without shortage or backlogging. The
warehouse faces orders for the product from the different retailers and in turn
places orders to an outside supplier. A fixed, facility-dependent setup cost is
charged each time the warehouse or the retailers receive an order, and an
inventory carrying cost is accrued at each facility at a constant facility-dependent
rate. The objective is to determine simultaneously the timing and sizes of
deliveries from the warehouse to the retailers as well as replenishment strategies at
the warehouse so as to minimize the long-run average inventory purchasing and
carrying costs.

In the absence of a fixed setup cost charged when the warehouse places an order, 
the problem can be decomposed into an economic lot size model for each retailer. 
That is, the existence of this cost ties together the different retailers, requiring 
the warehouse to coordinate its orders and deliveries to the different retailers. It 
is well known that optimal policies can be very complex, and thus the problem 
has attracted a considerable amount of attention in recent years (see Graves and 
Schwarz 1977, Roundy 1985). The latter paper presents the best approach cur- 
rently available for this model; it suggests a set of power-of-two reorder intervals 
for each facility and shows that the cost of this solution is within 6 % of a lower 
bound on the optimal cost. In this section, we present this method along with the 
worst-case bound. 

7.3.2 Model and Analysis

Consider a single warehouse (indexed by 0) that supplies n retailers, indexed 
1, 2, ... ,n. We will use the term facility to designate either the warehouse or a 
retailer. We make the following assumptions. 

• Each retailer faces a constant demand rate of $D_i$ units, for $i = 1, 2, \ldots, n$.

• The setup cost for an order at a facility is $K_i$, for $i = 0, 1, \ldots, n$.

• The holding cost is $h'_0$ at the warehouse and $h'_i$ at retailer $i$, with $h'_i \ge h'_0$
for each $i = 1, 2, \ldots, n$.

• No shortages are allowed. 

As demonstrated by several researchers, policies for this problem may be quite 
complex, and thus it is of interest to restrict our attention to a subset of all feasible 
policies. A popular subset of policies is the set of nested and stationary policies. A 
nested policy is characterized by having each retailer place an order whenever the 
warehouse does. As in the previous section, stationarity implies that reorder inter- 
vals are constant for each facility. It is easy to show that any policy should satisfy 
the zero-inventory-ordering property. Roundy (1985) showed that, although ap- 
pealing from a coordination point of view, nested policies may perform arbitrarily 
badly in one-warehouse, multiretailer systems. We therefore will not restrict
ourselves to nested policies. We concentrate on policies where each retailer’s reorder
intervals are a power-of-two multiple of a base planning period $T_B$. Below, we
assume the base planning period is fixed. The worst-case bound reduces to 1.02 if it
can be chosen optimally, although we omit this extension.

Let’s first determine the cost of an arbitrary power-of-two policy $T = \{T_0, T_1, \ldots, T_n\}$
that satisfies the zero-inventory-ordering property. If we consider the
inventory at the warehouse, then it does not have the saw-toothed pattern. To overcome
this difficulty, it is convenient to introduce the notion of system inventory as well
as echelon holding cost rates. Retailer $i$’s system inventory is defined as the
inventory at retailer $i$ plus the inventory at the warehouse that is destined for retailer
$i$. If we consider the system inventory of retailer $i$, then it has the saw-toothed
pattern. Echelon holding cost rates are defined as $h_0 = h'_0$ and $h_i = h'_i - h'_0$. For
simplicity, define $g_i = \frac{1}{2} h_i D_i$ and $g^i = \frac{1}{2} h_0 D_i$ for each $i = 1, 2, \ldots, n$. To compute
the cost of such a policy, we separate each item in the warehouse’s inventory into
categories depending on the retailer for which the item is destined. Let $H_i(T_0, T_i)$
be the average cost of holding inventory for retailer $i$ at the warehouse and at
retailer $i$. We claim

$$H_i(T_0, T_i) = g_i T_i + g^i \max\{T_0, T_i\}.$$

To prove this, consider the two cases:

Case 1: $T_i \ge T_0$. Since $T$ is a power-of-two policy, $T_i \ge T_0$ implies that the
warehouse places an order every time the retailer does. Therefore, the warehouse
never holds inventory for retailer $i$, and the average holding cost is

$$\frac{1}{2} (h_i + h_0) D_i T_i = g_i T_i + g^i T_i.$$

Case 2: $T_i < T_0$. Consider the portion of the warehouse inventory that is destined
for retailer $i$. Using the echelon holding cost rates, that is, inventory at retailer $i$
is charged at a rate of $h_i$ and system inventory is charged at a rate of $h_0$, we have

$$H_i(T_0, T_i) = \frac{1}{2} h_i D_i T_i + \frac{1}{2} h_0 D_i T_0 = g_i T_i + g^i T_0.$$

Therefore, the average cost of a power-of-two policy $T$ is given by

$$\sum_{i \ge 0} \frac{K_i}{T_i} + \sum_{i \ge 1} H_i(T_0, T_i). \tag{7.6}$$

Our objective then is to find the power-of-two policy $T$ that minimizes (7.6).

Our approach to solving this problem is to first minimize the average cost over
all vectors $T \ge 0$. That is, we solve this problem when the restriction to
power-of-two vectors is relaxed. We then round the solution $T$ to a vector whose elements
are power-of-two multiples of $T_B$.

For a fixed value of $T_0$, we consider the following problem:

$$b_i(T_0) = \inf_{T_i > 0} \left\{ \frac{K_i}{T_i} + g_i T_i + g^i \max\{T_0, T_i\} \right\}. \tag{7.7}$$

To solve this problem, let $\tau'_i = \sqrt{\frac{K_i}{g_i + g^i}}$, let $\tau_i = \sqrt{\frac{K_i}{g_i}}$, and note that $\tau'_i \le \tau_i$ for
all $i \ge 1$. Then one can show that

$$
b_i(T_0) =
\begin{cases}
2\sqrt{K_i (g_i + g^i)} & \text{if } T_0 < \tau'_i, \\
\frac{K_i}{T_0} + (g_i + g^i) T_0 & \text{if } \tau'_i \le T_0 \le \tau_i, \\
2\sqrt{K_i g_i} + g^i T_0 & \text{if } \tau_i < T_0.
\end{cases}
$$

That is, if $T_0 < \tau'_i$, it is best to choose $T_i^* = \tau'_i$. If $\tau'_i \le T_0 \le \tau_i$, then choose
$T_i^* = T_0$. If $T_0 > \tau_i$, it is best to choose $T_i^* = \tau_i$.

We now consider minimizing

$$B(T_0) = \frac{K_0}{T_0} + \sum_{i=1}^{n} b_i(T_0)$$

over all $T_0 > 0$. The function $B$ is of the form

$$\frac{K(T_0)}{T_0} + M(T_0) + H(T_0) T_0$$

over any interval where $K(\cdot)$, $M(\cdot)$, and $H(\cdot)$ are constant. For any $T_0$, define the
sets $G(T_0) = \{i : T_0 < \tau'_i\}$, $E(T_0) = \{i : \tau'_i \le T_0 \le \tau_i\}$, and $L(T_0) = \{i : \tau_i < T_0\}$.
Then $K(\cdot)$, $M(\cdot)$, and $H(\cdot)$ are constant on those intervals where $G(\cdot)$, $E(\cdot)$, and $L(\cdot)$
do not change. To find the minimum of $B$, consider the intervals induced by the
$2n$ values $\tau'_i$ and $\tau_i$ for $i = 1, 2, \ldots, n$. Say $T_0$ falls in one of these intervals. From
the expression for $b_i(T_0)$, on that interval we have

$$
\begin{aligned}
K(T_0) &= K_0 + \sum_{i \in E(T_0)} K_i, \\
M(T_0) &= 2 \sum_{i \in G(T_0)} \sqrt{K_i (g_i + g^i)} + 2 \sum_{i \in L(T_0)} \sqrt{K_i g_i}, \\
H(T_0) &= \sum_{i \in E(T_0)} (g_i + g^i) + \sum_{i \in L(T_0)} g^i,
\end{aligned}
$$

and for $i = 1, 2, \ldots, n$ we set

$$
T_i^* =
\begin{cases}
\tau'_i & \text{if } i \in G(T_0), \\
T_0 & \text{if } i \in E(T_0), \\
\tau_i & \text{if } i \in L(T_0).
\end{cases}
$$

The sets $G$, $E$, and $L$ change only when $T_0$ crosses a breakpoint $\tau'_i$ or $\tau_i$ for some
$i \ge 1$. Specifically, if $T_0$ moves from right to left across $\tau_i$, retailer $i$ moves from
$L$ to $E$. If $T_0$ moves from right to left across $\tau'_i$, retailer $i$ moves from $E$ to $G$.
This suggests a simple algorithm to minimize $B(T_0)$. Start with $T_0$ larger than the
largest breakpoint, and let $L = \{1, 2, \ldots, n\}$ and $G = E = \emptyset$. We then successively
decrease $T_0$, moving from interval to interval. On each interval we need only check
that $\sqrt{K(T_0)/H(T_0)}$ falls in the same subinterval as $T_0$. In this case, we set
$T_0^* = \sqrt{K(T_0)/H(T_0)}$ since $B(T_0)$ is strictly convex in $T_0$. Let
$B^* = B(T_0^*) = \inf_{T_0 > 0} \{B(T_0)\}$; then this
value is clearly a lower bound on the cost of any power-of-two policy.
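
The minimization of $B(T_0)$ can be sketched as follows. This is an illustrative implementation under stated assumptions: $K_0 > 0$, $K_i > 0$, $g_i > 0$, and at least one $g^i > 0$; the names `b_i`, `B`, and `minimize_B` are hypothetical. For simplicity the sketch scans the intervals between breakpoints from left to right instead of right to left; because $B$ is convex, both orders find the same stationary point. A fallback to the best breakpoint guards against rounding at interval boundaries.

```python
import math


def b_i(T0, Ki, gi, gi_up):
    """Optimal cost of retailer i for a fixed warehouse interval T0 (gi_up is g^i)."""
    tau_lo = math.sqrt(Ki / (gi + gi_up))  # tau'_i
    tau_hi = math.sqrt(Ki / gi)            # tau_i
    if T0 < tau_lo:
        return 2 * math.sqrt(Ki * (gi + gi_up))
    if T0 <= tau_hi:
        return Ki / T0 + (gi + gi_up) * T0
    return 2 * math.sqrt(Ki * gi) + gi_up * T0


def B(T0, K0, K, g, g_up):
    """Relaxed system cost as a function of the warehouse interval T0."""
    return K0 / T0 + sum(b_i(T0, *args) for args in zip(K, g, g_up))


def minimize_B(K0, K, g, g_up):
    """Return (T0_star, T_star, B_star) for the relaxed problem."""
    n = len(K)
    tau_lo = [math.sqrt(K[i] / (g[i] + g_up[i])) for i in range(n)]
    tau_hi = [math.sqrt(K[i] / g[i]) for i in range(n)]
    points = sorted(set(tau_lo + tau_hi))
    edges = [0.0] + points + [math.inf]
    T0 = None
    for left, right in zip(edges, edges[1:]):
        # Sets E and L for any T0 strictly inside (left, right).
        E = [i for i in range(n) if tau_lo[i] <= left and right <= tau_hi[i]]
        L = [i for i in range(n) if tau_hi[i] <= left]
        K_bar = K0 + sum(K[i] for i in E)
        H = sum(g[i] + g_up[i] for i in E) + sum(g_up[i] for i in L)
        if H > 0:
            cand = math.sqrt(K_bar / H)
            if left <= cand <= right:
                T0 = cand
                break
    if T0 is None:  # numerical fallback: pick the best breakpoint
        T0 = min(points, key=lambda x: B(x, K0, K, g, g_up))
    T = [tau_lo[i] if T0 < tau_lo[i] else min(T0, tau_hi[i]) for i in range(n)]
    return T0, T, B(T0, K0, K, g, g_up)
```

For instance, with $K_0 = 5$, $K = (1, 4, 2)$, $g = (1, 0.5, 2)$, and $g^i = (0.5, 0.5, 1)$, the stationary point lies between the breakpoints 1 and 2, giving $T_0^* = \sqrt{5/1.5}$ and $T^* = (1, 2, 1)$.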

We now want to prove that this value is a lower bound on the cost of any
policy. For notational convenience, we abbreviate $G^* = G(T_0^*)$, $E^* = E(T_0^*)$, and
$L^* = L(T_0^*)$. Let $K = K_0 + \sum_{i \in E^*} K_i$, $G = \sum_{i \in E^*} (g_i + g^i) + \sum_{i \in L^*} g^i$, and
$M = 2\sqrt{KG}$. We also define for each $i \ge 0$

$$
G_i =
\begin{cases}
g_i + g^i & \text{if } i \in G^*, \\
g_i & \text{if } i \in L^*, \\
\frac{K_i}{(T_0^*)^2} & \text{if } i \in E^* \cup \{0\},
\end{cases}
$$

$G^i = g^i + g_i - G_i$, and $M_i = 2\sqrt{K_i G_i}$. In this way, we can write $B^*$ as

$$B^* = M + \sum_{i \in L^* \cup G^*} M_i. \tag{7.8}$$

We now prove that $B^*$ is a lower bound on any policy. We first show that, in
fact, $B^* = \sum_{i \ge 0} M_i$. From (7.8), we need only show that $M = \sum_{i \in E^* \cup \{0\}} M_i$:

$$
\begin{aligned}
M = 2\sqrt{KG} &= 2 \frac{K}{T_0^*} \\
&= 2 \sum_{i \in E^* \cup \{0\}} \frac{K_i}{T_0^*} \\
&= 2 \sum_{i \in E^* \cup \{0\}} \frac{K_i}{\sqrt{K_i / G_i}} \\
&= 2 \sum_{i \in E^* \cup \{0\}} \sqrt{K_i G_i} \\
&= \sum_{i \in E^* \cup \{0\}} M_i.
\end{aligned}
$$

Consider any policy over an interval $[0, t']$ for $t' > 0$. We show that the total
cost associated with this policy over $[0, t']$ is at least $B^* t'$. Let $m_i$ be the number
of orders placed by facility $i \ge 0$ in the interval $[0, t']$. Let $I_i(t)$ be the inventory
at facility $i \ge 1$ at time $t$, and let $S_i(t)$ be the system inventory of facility $i \ge 1$ at
time $t$. Clearly, the total inventory holding cost is

$$\sum_{i \ge 1} \int_0^{t'} \left( h_i I_i(t) + h_0 S_i(t) \right) dt.$$

We will show that this is no smaller than

$$\sum_{i \ge 1} \int_0^{t'} \left( H_i I_i(t) + H^i S_i(t) \right) dt,$$

where $H_i = \frac{2 G_i}{D_i}$ and $H^i = \frac{2 G^i}{D_i}$ for each $i = 0, 1, \ldots, n$. For this purpose, consider
the quantity $H_i I_i(t) + H^i S_i(t)$ for each $i \ge 1$. There are three cases to consider.

Case 1: $i \in G^*$. Then $G_i = g_i + g^i$ and $G^i = g_i + g^i - G_i = 0$, and since $S_i(t) \ge I_i(t)$
for all $t \ge 0$, we have

$$h_i I_i(t) + h_0 S_i(t) \ge H_i I_i(t) + H^i S_i(t).$$

Recall that $h_i = \frac{2 g_i}{D_i}$, $h_0 = \frac{2 g^i}{D_i}$, $H_i = \frac{2 G_i}{D_i}$, and $H^i = \frac{2 G^i}{D_i}$.

Case 2: $i \in L^*$. Then $G_i = g_i$ and $G^i = g_i + g^i - G_i = g^i$; hence,

$$h_i I_i(t) + h_0 S_i(t) = H_i I_i(t) + H^i S_i(t).$$

Case 3: $i \in E^*$. Then $G_i = \frac{K_i}{(T_0^*)^2}$ and $G^i = g_i + g^i - G_i$. Observe that, by definition,
if $i \in E^*$, then $\tau'_i \le T_0^* \le \tau_i$, which implies $g_i \le G_i \le g_i + g^i$. Since $S_i(t) \ge I_i(t)$
for all $t \ge 0$, then

$$
\begin{aligned}
h_i I_i(t) + h_0 S_i(t) &= H_i I_i(t) + H^i S_i(t) + (H_i - h_i)(S_i(t) - I_i(t)) \\
&\ge H_i I_i(t) + H^i S_i(t).
\end{aligned}
\tag{7.9}
$$

Therefore, our lower bound on the inventory holding cost can be written as

$$\sum_{i \ge 1} \int_0^{t'} \left( H_i I_i(t) + H^i S_i(t) \right) dt = \sum_{i \ge 0} \int_0^{t'} H_i I_i(t)\, dt,$$

where we have defined $I_0(t) = \frac{1}{H_0} \sum_{i \ge 1} H^i S_i(t)$.

Hence, the total cost per unit of time under this policy is at least

$$
\begin{aligned}
\frac{1}{t'} \sum_{i \ge 0} \left( K_i m_i + \int_0^{t'} H_i I_i(t)\, dt \right) &\ge \sum_{i \ge 0} \left( K_i \frac{m_i}{t'} + G_i \frac{t'}{m_i} \right) \\
&\ge 2 \sum_{i \in L^* \cup G^*} \sqrt{K_i G_i} + 2 \sum_{i \in E^* \cup \{0\}} \sqrt{K_i G_i} \\
&= \sum_{i \ge 0} M_i = B^*,
\end{aligned}
$$

where the first inequality follows from Property 7.1.1 and the fact that
$G_0 = \sum_{i \ge 1} G^i$ (see Exercise 7.7). We have thus established that $B^*$ is a lower bound on
the total cost per unit time of any policy.

Finally, for each $i \in G^* \cup L^*$, select a power-of-two policy (a value of $k$) such
that

$$\frac{1}{\sqrt{2}} T_i^* \le T_B 2^k \le \sqrt{2}\, T_i^*.$$

For each $i \in E^* \cup \{0\}$, select a power-of-two policy (a value of $k$) such that

$$\frac{1}{\sqrt{2}} T_0^* \le T_B 2^k \le \sqrt{2}\, T_0^*.$$

It is a simple exercise (Exercise 7.4) to show that the policy constructed in this
manner has cost at most 1.06 times the cost of the lower bound.
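
The rounding step and the resulting policy cost can be sketched as below. The function names are hypothetical, and the sketch assumes that $k$ may be any integer, including negative values, so that the interval $[T^*/\sqrt{2}, \sqrt{2}\,T^*]$ can always be reached; with $k$ restricted to nonnegative integers this is possible only when $T_B$ is small enough. Retailers in $E^*$ have $T_i^* = T_0^*$, so applying the same rule to every facility reproduces both cases of the text. The cost function evaluates the sum of $K_i/T_i$ over all facilities plus the holding terms $g_i T_i + g^i \max\{T_0, T_i\}$.

```python
import math


def round_to_power_of_two(T_star, T_B):
    """Return T_B * 2**k with T_star/sqrt(2) <= T_B * 2**k <= sqrt(2) * T_star."""
    k = math.ceil(math.log2(T_star / (math.sqrt(2) * T_B)))
    return T_B * 2 ** k


def policy_cost(T0, T, K0, K, g, g_up):
    """Average cost of a policy with warehouse interval T0 and retailer intervals T."""
    cost = K0 / T0
    for Ti, Ki, gi, gi_up in zip(T, K, g, g_up):
        cost += Ki / Ti + gi * Ti + gi_up * max(T0, Ti)
    return cost
```

As a usage example, take $K_0 = 5$, $K = (1, 4, 2)$, $g = (1, 0.5, 2)$, $g^i = (0.5, 0.5, 1)$, with relaxed solution $T_0^* = \sqrt{5/1.5}$ and $T^* = (1, 2, 1)$, and $T_B = 1$. Rounding gives $T_0 = 2$ and $T = (1, 2, 1)$, with cost 15.5 against a lower bound of $10 + 2\sqrt{7.5} \approx 15.48$.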



7.4 Exercises 



Exercise 7.1. Consider the economic lot size model, and let $K$ be the setup cost,
$h$ the holding cost per item per unit of time, and $D$ the demand rate. Shortage is
not allowed and the objective is to find an order quantity so as to minimize the
long-run average cost. That is, the objective is to minimize

$$C(Q) = \frac{KD}{Q} + \frac{hQ}{2},$$

where $Q$ is the order quantity. Suppose the warehouse can order only an integer
multiple of $q$ units. That is, the warehouse can order $q$, or $2q$, or $3q$, and so on.

(a) Prove that the optimal order quantity $Q^*$ has the following property. There
exists an integer $m$ such that $Q^* = mq$ and

$$\sqrt{\frac{m-1}{m}} \le \frac{Q^e}{Q^*} \le \sqrt{\frac{m+1}{m}},$$

where $Q^e$, the economic order quantity, is

$$Q^e = \sqrt{\frac{2KD}{h}}.$$

(b) Suppose now that $m \ge 2$. Show that $C(Q^*) \le 1.06\, C(Q^e)$.

Exercise 7.2. (Zavi 1976) Consider the economic lot size model with infinite
horizon and deterministic demand $D$ items per unit of time. When the inventory
level is zero, production of $Q$ items starts at a rate of $P$ items per unit of time,
$P \ge D$. The setup cost is $\$K$ and the holding cost is $\$h$ per item per unit of time. Every time
production starts at a level of $P$ items per unit of time, we incur a cost of $aP$, $a > 0$.

(a) What is the optimal production rate?

(b) Suppose that due to technological constraints, $P$ must satisfy $2D \le P \le 3D$.
What are the optimal production rate and the optimal order quantity?



Exercise 7.3. Consider the economic lot size model over the infinite horizon.
Assume that when an order of size $Q$ is placed, the items are delivered by trucks
of capacity $q$, and thus the number of trucks used to deliver $Q$ is $\lceil Q/q \rceil$, where $\lceil m \rceil$ is
the smallest integer greater than or equal to $m$. The setup cost is a linear function
of the number of trucks used: It is Kq + The holding cost is h $/item/time, 

and shortage is not allowed. What is the optimal reorder quantity? 



Exercise 7.4. Prove that the heuristic for the single-warehouse, multiretailer 
model described in Sect. 7.3 provides a solution within 1.06 of the lower bound. 



Exercise 7.5. Consider the power-of-two policies described in the single-product
model of Sect. 7.1.3. Describe how you could generate a power-of-three policy (a
policy where each $T_i = 3^k T_B$ for some integer $k \ge 0$). What is the effectiveness
(in terms of worst-case performance) of the best power-of-three policy?

Exercise 7.6. (Porteus 1985) The Japanese concept of JIT (just-in-time)
advocates reducing setup cost as much as possible. To analyze this concept, consider
the economic lot size model with constant demand of $D$ items per year, holding
cost $\$h$ per item per year, and current setup cost $K_0$. Suppose you can lease
a new technology that allows you to reduce the setup cost from $K_0$ to $K$ at an
annual leasing cost of $A - B\ln(K)$ dollars. That is, reducing the setup cost from
the current setup cost, $K_0$, to $K$ will annually cost $A - B\ln(K)$ dollars. Of course,
we assume that $A - B\ln(K_0) = 0$, which implies that using the current setup
cost requires no leasing cost. What is the optimal setup cost? What is the optimal
order quantity in this case?

Exercise 7.7. Show that in the proof of the lower bound, $B^*$, for the
single-warehouse, multiretailer model, we have $G_0 = \sum_{i \ge 1} G^i$.

Exercise 7.8. Prove (7.9). 



8 

Economic Lot Size Models 
with Varying Demands 



Our analysis of inventory models so far has focused on situations where demand 
was both known in advance and constant over time. We now relax this latter 
assumption and turn our attention to systems where demand is known in advance 
yet varies with time. This is possible, for example, if orders have been placed 
in advance, or contracts have been signed specifying deliveries for the next few 
months. In this case, a planning horizon is defined as those periods where demand 
is known. Our objective is to identify optimal inventory policies for single-item
models as well as heuristics for the multi-item case. We also present extensions to 
single-item models with price-dependent demand. 



8.1 The Wagner-Whitin Model 

Assume we must plan a sequence of orders, or production batches, over a T-period 
planning horizon. In each period, a single decision must be made: the size of the 
order or production batch. 

We make the following assumptions: 

• Demand during period $t$ is known and denoted by $d_t \ge 0$.

• The per-unit order cost is $c$ and a fixed order cost $K$ is incurred every time
an order is placed; that is, if $y$ units are ordered, the order cost is $cy + K\delta(y)$
[where $\delta(y) = 1$ if $y > 0$, and 0 otherwise].

• There is a holding cost $h > 0$ per unit per period.

• Initial inventory is zero.

• Lead times are zero; that is, an order arrives as soon as it is placed. 

• All ordering and demand occur at the start of the period. Inventory holding 
cost is charged on the amount on hand at the end of the period. 

The problem is to decide how much to order in each period so that demands 
are met without backlogging and the total cost, including the cost of ordering and 
holding inventory, is minimized. This basic model was first analyzed by Wagner 
and Whitin (1958b) and has now been called the Wagner-Whitin model. 

Let $y_t$ be the amount ordered in period $t$, and let $I_t$ be the amount of product
in inventory at the end of period $t$. Using these variables, we can formulate the
problem as follows:

\[
\begin{aligned}
\text{Problem } WW: \quad \min \ & \sum_{t=1}^{T} \left[ K\delta(y_t) + c y_t + h I_t \right] \\
\text{s.t.} \ & I_t = I_{t-1} + y_t - d_t, \quad t = 1, 2, \ldots, T, && (8.1) \\
& I_0 = 0, && (8.2) \\
& I_t, y_t \ge 0, \quad t = 1, 2, \ldots, T. && (8.3)
\end{aligned}
\]



Here constraints (8.1) are called the inventory-balance constraints, while (8.2) simply
specifies the initial inventory. Note that the inventory can also be rewritten
as $I_t = \sum_{i=1}^{t} (y_i - d_i)$, and therefore the $I_t$ variables can be eliminated from the
formulation.

In the above model, it is clear that the total variable order cost incurred will be 
fixed and independent of the schedule of orders, and thus we ignore this cost in 
our analysis until we talk about models with price-dependent demand. 

Wagner and Whitin make the following important observation. 

Theorem 8.1.1 Any optimal policy is a zero-inventory-ordering policy, that is, a
policy in which

\[ y_t I_{t-1} = 0, \quad \text{for } t = 1, 2, \ldots, T. \]

Proof. The proof is quite simple. By contradiction, assume there is an optimal
policy in which an order is placed in period $t$ even though the inventory level at
the beginning of this period, $I_{t-1}$, is positive. We will demonstrate the existence
of another policy with a lower total cost. Evidently, the $I_{t-1}$ items of inventory
were ordered in various periods prior to $t$. Thus, if we instead order these items
in period $t$, we save all the holding cost incurred from the time they were each
ordered. ∎

Thus, ordering only occurs when inventory is zero. A simple corollary is that
in an optimal policy, each order exactly covers the demands of an integer
number of subsequent periods.

Using the above property, Wagner and Whitin developed a dynamic programming
algorithm to determine those periods when ordering takes place. By constructing
a simple acyclic network with nodes $V = \{1, 2, \ldots, T+1\}$, we can view
the problem of determining a policy as a shortest-path problem. Formally, let $\ell_{ij}$,
the length of arc $(i, j)$ in this network, be the cost of ordering in period $i$ to
satisfy the demands in periods $i, i+1, \ldots, j-1$, for all $1 \le i < j \le T+1$. That is,

\[ \ell_{ij} = K + h \sum_{k=i}^{j-1} (k - i) d_k. \]

All other arcs have $\ell_{ij} = +\infty$. The length of the shortest path from node 1 to
node $T+1$ in this acyclic network is the minimal cost of satisfying the demands
for periods 1 through $T$. The optimal policy, that is, a specification of the periods
in which an order is placed, can be easily reconstructed from the shortest path
itself. This procedure is clearly $O(T^2)$.
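
The shortest-path view can be illustrated with a short Python sketch. The function name `wagner_whitin`, the 0-based demand list, and the sample data are illustrative assumptions rather than anything prescribed by the text; the constant variable order cost is left out, as discussed above. Arc lengths are accumulated incrementally so that the whole computation stays quadratic in $T$.

```python
from math import inf

def wagner_whitin(d, K, h):
    """Shortest-path solution of the Wagner-Whitin model (illustrative sketch).

    d[t-1] is the demand of period t (t = 1..T), K the fixed order cost,
    h the per-unit, per-period holding cost.  Returns (cost, order_periods).
    """
    T = len(d)
    best = [inf] * (T + 2)   # best[j]: shortest distance from node 1 to node j
    pred = [0] * (T + 2)     # pred[j]: order period preceding node j on that path
    best[1] = 0.0
    for i in range(1, T + 1):
        holding = 0.0
        for j in range(i + 1, T + 2):
            # Extending the arc from (i, j-1) to (i, j) adds the demand of
            # period j-1, which is held for (j - 1 - i) periods.
            holding += h * (j - 1 - i) * d[j - 2]
            length = K + holding
            if best[i] + length < best[j]:
                best[j] = best[i] + length
                pred[j] = i
    # Walk the predecessor labels back from node T+1 to recover order periods.
    periods, node = [], T + 1
    while node > 1:
        node = pred[node]
        periods.append(node)
    return best[T + 1], periods[::-1]

cost, periods = wagner_whitin([10, 20, 30], K=50, h=1)
print(cost, periods)  # 120.0 [1, 3]
```

In this small instance it is cheaper to carry the 20 units of period 2 for one period than to pay a second setup, while the 30 units of period 3 justify their own order.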

Most of the assumptions made above can be relaxed without changing the basic
solution methodology. For example, one can consider problem data that are
period-dependent (e.g., $c_t$, $h_t$, or $K_t$). The assumption of zero lead times can be relaxed if
one assumes the lead times are known in advance and deterministic. In that case,
if an order is required in period $t$, then it is ordered in period $t - L$, where $L$ is
the lead time.

Researchers have also considered order costs that are general concave functions 
of the amount ordered, that is, $c_t(y)$. The problem can be formulated as a network
flow problem with concave arc costs. This was the approach of Zangwill (1966), 
who also extended the model to handle backlogging although the solution method 
is only computationally attractive for small problems. 

The Wagner-Whitin model can also be useful if demands during periods well
into the future are not known. This idea is embodied in the following theorem. 

Theorem 8.1.2 Let $t$ be the last period a setup occurs in the optimal order policy
associated with a $T$-period problem. Then for any problem of length $T^* > T$, it is
necessary to consider only periods $\{j : t \le j \le T^*\}$ as candidates for the last setup.
Furthermore, if $t = T$, the optimal solution to a $T^*$-period problem has $y_t > 0$.

This result is useful since it shows that if an order is placed in period $t$, the optimal
policy for periods $1, 2, \ldots, t-1$ does not depend on demands beyond period $t$.

Surprisingly, even though the Wagner-Whitin solution procedure is extremely 
efficient, often simple approximate yet intuitive heuristics may be more appealing 
to managers. For example, this may be the reason for the popularity of the Silver 
and Meal (1973) heuristic or the part-period balancing heuristic of Dematteis 
(1968). One important reason is the sensitivity of the optimal strategy to changes 
in forecasted demands $d_t$, $t = 1, 2, \ldots, T$. Indeed, in practice, these forecasted
demands are typically modified “on the fly.” These changes typically imply changes
in the optimal strategy. Some of the previously mentioned heuristics are not as
sensitive to these changes while producing optimal or near-optimal strategies. For
another approach, see Federgruen and Tzur (1991).

Researchers have shown that it is possible to take advantage of the special
cost structure in the Wagner-Whitin model and use it to develop faster exact
algorithms [i.e., $O(T)$]. This includes the work of Aggarwal and Park (1993),
Federgruen and Tzur (1991), and Wagelmans et al. (1992).

We sketch here the $O(T)$ algorithm of Wagelmans et al., which is the most
intuitive of those proposed. It is a backward dynamic programming approach.
Define $d_{ij} = \sum_{t=i}^{j} d_t$ for $i, j = 1, 2, \ldots, T$, that is, the demand from period $i$ to
period $j$. To describe the algorithm, we will slightly change the way we account
for the holding cost. If an item is ordered in period $i$, then we are charged $H_i =
(T - i + 1)h$ per unit. That is, we incur the holding cost until the end of the time
horizon. As long as we remember to subtract the constant $h \sum_{t=1}^{T} d_{1t}$ from our
final cost, then we are charged exactly the right amount. With this in mind, define
$G(i)$ to be the cost of an optimal solution with a planning horizon from period $i$ to
period $T$, for $i = 1, 2, \ldots, T$. For convenience, define $G(T+1) = 0$. Then

\[
\begin{aligned}
G(i) &= \min_{i < t \le T+1} \{ K + H_i d_{i,t-1} + G(t) \} \\
&= K + \min_{i < t \le T+1} \{ H_i d_{i,t-1} + G(t) \}. \qquad (8.4)
\end{aligned}
\]

The final cost is then $G(1) - h \sum_{t=1}^{T} d_{1t}$. Using this recursion, which is just a
reformulation of the shortest-path recursion discussed earlier, we clearly find that
the complexity is $O(T^2)$. Wagelmans et al.’s $O(T)$ algorithm is based on the crucial
observation that with careful implementation, the total amount of time spent
finding the period that minimizes (8.4) over the entire running of the algorithm
is $O(T)$.
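
To make the modified cost accounting concrete, here is a minimal Python sketch of recursion (8.4) in its plain quadratic form, without the linear-time refinements. The function name `backward_cost` and the prefix-sum helper are illustrative assumptions. The sketch charges $H_i$ per unit ordered in period $i$ and subtracts the constant $h \sum_{t} d_{1t}$ at the end.

```python
def backward_cost(d, K, h):
    """Minimal total cost via recursion (8.4); d[t-1] is the demand of period t."""
    T = len(d)
    prefix = [0] * (T + 1)                 # prefix[t] = d_1 + ... + d_t
    for t in range(1, T + 1):
        prefix[t] = prefix[t - 1] + d[t - 1]

    def dsum(i, j):                        # d_{ij}: total demand of periods i..j
        return prefix[j] - prefix[i - 1]

    G = [0] * (T + 2)                      # G[T + 1] = 0 by convention
    for i in range(T, 0, -1):
        H_i = (T - i + 1) * h              # holding charged to the end of the horizon
        G[i] = K + min(H_i * dsum(i, t - 1) + G[t] for t in range(i + 1, T + 2))
    # Remove the overcharge: a unit demanded in period t was billed for the
    # extra (T - t + 1) periods after it left inventory.
    overcharge = h * sum(dsum(1, t) for t in range(1, T + 1))
    return G[1] - overcharge

print(backward_cost([10, 20, 30], K=50, h=1))  # 120
```

For the demands (10, 20, 30) with $K = 50$ and $h = 1$, the recursion gives $G(1) = 220$ and the overcharge is $10 + 30 + 60 = 100$, so the true cost is 120.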

Consider the calculation of $G(i)$. It is useful to plot the points $(d_{jT}, G(j))$ for
$j = i+1, i+2, \ldots, T+1$, where the point $(d_{T+1,T}, G(T+1))$ is simply the origin. Let
$\mathcal{E}$ be the lower convex envelope of these points; then define the function $g(x) = y$
if and only if $(x, y) \in \mathcal{E}$. It is clear that $g$ is a piecewise linear convex function on
$[0, d_{i+1,T}]$ with $g(d_{i+1,T}) = G(i+1)$ and $g(0) = 0$. See Fig. 8.1.

Define the breakpoints of $g$ to be all the points $x$ where $g$ changes slope in
addition to the points $x = 0$ and $x = d_{i+1,T}$. If $x$ is a breakpoint, then $x = d_{jT}$
for some period $j \in \{i+1, i+2, \ldots, T+1\}$. Let there be $r$ breakpoints and let
$i+1 = t(1) < t(2) < \cdots < t(r) = T+1$ denote the corresponding periods. These
periods are called efficient because of the following.

Theorem 8.1.3

\[ \min_{i < t \le T+1} \{ H_i d_{i,t-1} + G(t) \} = \min_{1 \le p \le r} \{ H_i d_{i,t(p)-1} + G(t(p)) \}. \]


Proof. Suppose that $j$ (with $i+1 < j < T+1$) is not an efficient period, and let $k$
and $\ell$ (with $k < j < \ell$) be the two consecutive efficient periods straddling $j$. The
slope of $g$ on $[d_{\ell T}, d_{kT}]$ is equal to $[G(k) - G(\ell)]/d_{k,\ell-1}$; hence,

\[ g(d_{jT}) = G(\ell) + \frac{G(k) - G(\ell)}{d_{k,\ell-1}} \, d_{j,\ell-1}. \]

Furthermore, $G(j) \ge g(d_{jT})$.

There are two cases to consider.

Case 1: $H_i \ge \dfrac{G(k) - G(\ell)}{d_{k,\ell-1}}$. Then

\[
\begin{aligned}
H_i d_{i,j-1} + G(j) &\ge H_i d_{i,k-1} + H_i d_{k,j-1} + g(d_{jT}) \\
&\ge H_i d_{i,k-1} + \frac{G(k) - G(\ell)}{d_{k,\ell-1}} \, d_{k,j-1} + G(\ell) + \frac{G(k) - G(\ell)}{d_{k,\ell-1}} \, d_{j,\ell-1} \\
&= H_i d_{i,k-1} + G(k).
\end{aligned}
\]

Case 2: $H_i < \dfrac{G(k) - G(\ell)}{d_{k,\ell-1}}$. Then

\[
\begin{aligned}
H_i d_{i,j-1} + G(j) &\ge H_i d_{i,\ell-1} - H_i d_{j,\ell-1} + g(d_{jT}) \\
&\ge H_i d_{i,\ell-1} - \frac{G(k) - G(\ell)}{d_{k,\ell-1}} \, d_{j,\ell-1} + G(\ell) + \frac{G(k) - G(\ell)}{d_{k,\ell-1}} \, d_{j,\ell-1} \\
&= H_i d_{i,\ell-1} + G(\ell).
\end{aligned}
\]

In both cases, the minimum occurs at an efficient period. ∎
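
Theorem 8.1.3 can be checked numerically with a small Python sketch. It computes $G(i)$ twice: once minimizing over all periods, and once minimizing only over the vertices of the lower convex envelope of the points $(d_{jT}, G(j))$, which is maintained with a standard monotone-chain stack. The function name, the stack-based hull update, and the assumption of strictly positive integer demands are illustrative choices. This is not the linear-time algorithm of Wagelmans et al., since each minimum is still found by a plain scan of the list.

```python
def compare_all_vs_efficient(d, K, h):
    """Return (G_all, G_eff), the values G(1..T) from both minimizations."""
    T = len(d)
    S = [0] * (T + 2)                      # S[j] = d_{jT}; S[T + 1] = 0
    for j in range(T, 0, -1):
        S[j] = S[j + 1] + d[j - 1]
    G_all = [0] * (T + 2)
    G_eff = [0] * (T + 2)
    hull = [T + 1]                         # efficient periods, by increasing S[j]
    for i in range(T, 0, -1):
        H = (T - i + 1) * h
        # d_{i,t-1} = S[i] - S[t]
        G_all[i] = K + min(H * (S[i] - S[t]) + G_all[t] for t in range(i + 1, T + 2))
        G_eff[i] = K + min(H * (S[i] - S[t]) + G_eff[t] for t in hull)
        # Insert the new point (S[i], G_eff[i]); pop periods that now lie on or
        # above the segment joining their neighbours (they become inefficient).
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = ((S[b] - S[a]) * (G_eff[i] - G_eff[a])
                     - (G_eff[b] - G_eff[a]) * (S[i] - S[a]))
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    return G_all[1:T + 1], G_eff[1:T + 1]

g_all, g_eff = compare_all_vs_efficient([10, 10, 10, 10], K=100, h=1)
print(g_all == g_eff, g_all)  # True [260, 190, 140, 110]
```

In this instance periods 4 and 3 are dropped from the list as soon as later points are added, yet the restricted minimization returns the same values.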

Being able to quickly find the efficient period p that achieves the minimum is 
therefore crucial to the complexity of the algorithm. This step is aided by the 
following result. 

Lemma 8.1.4 Let $k$ and $\ell$, $k < \ell$, be two consecutive efficient periods. If

\[ \frac{G(k) - G(\ell)}{d_{k,\ell-1}} < H_i, \]

then

\[ H_i d_{i,k-1} + G(k) < H_i d_{i,\ell-1} + G(\ell); \]

otherwise,

\[ H_i d_{i,k-1} + G(k) \ge H_i d_{i,\ell-1} + G(\ell). \]

Proof. Suppose that $\frac{G(k) - G(\ell)}{d_{k,\ell-1}} < H_i$; then $G(k) < H_i d_{k,\ell-1} + G(\ell)$. Adding
$H_i d_{i,k-1}$ to both sides results in $H_i d_{i,k-1} + G(k) < H_i d_{i,\ell-1} + G(\ell)$. The other
case can be shown in a similar fashion. ∎

We now describe specifically how to find the efficient period achieving the minimum
in (8.4). This is done by keeping an up-to-date list $L$ of the current efficient
periods. Let $\ell(p)$ be the index of the efficient period immediately following efficient
period $p$; that is, $p < \ell(p)$. From Lemma 8.1.4 and the convexity of $g$, it follows
that the value of $j$ that achieves the minimum of

\[ \min_{i < j \le T+1} \{ H_i d_{i,j-1} + G(j) \} \]

corresponds to the period $q(i)$ defined by

\[ q(i) = \min \left[ T+1, \ \min \left\{ p \in L \ \middle|\ p < T+1 \text{ and } \frac{G(p) - G(\ell(p))}{d_{p,\ell(p)-1}} < H_i \right\} \right], \]

because then

\[ H_i d_{i,p-1} + G(p) \ge H_i d_{i,\ell(p)-1} + G(\ell(p)), \quad \text{for } p \in L \text{ and } p < q(i), \]

and

\[ H_i d_{i,p-1} + G(p) < H_i d_{i,\ell(p)-1} + G(\ell(p)), \quad \text{for } p \in L \text{ and } q(i) \le p < T+1. \]

In fact, it is easy to determine $q(i)$ from $q(i+1)$. Note that $q(i+1) \in L$ and
as long as $q(i+1)$ is efficient, it has the same successor $\ell(q(i+1))$ in $L$. Using the
definition of $q(i+1)$, we obtain

\[ \frac{G(q(i+1)) - G(\ell(q(i+1)))}{d_{q(i+1),\ell(q(i+1))-1}} < H_{i+1} \le H_i. \]

Hence, it follows that $q(i) \le q(i+1)$; that is, the values of $q(i)$ do not increase as $i$
decreases. Therefore, starting at $q(i+1)$, we successively decrement by one until we find
$q(i)$. The total amount of time spent searching for $q(i)$ in the entire algorithm is
therefore $O(T)$.

To complete the complexity result, we must be able to quickly update the list of
efficient periods, that is, update the lower convex envelope. After calculating $G(i)$
and plotting the point $(d_{iT}, G(i))$, we search for the smallest efficient period $t(s)$
such that the slope of the line segment connecting $(d_{iT}, G(i))$ to $(d_{t(s),T}, G(t(s)))$
is greater than the slope of the line segment connecting $(d_{t(s+1),T}, G(t(s+1)))$ to
$(d_{t(s),T}, G(t(s)))$ (thus maintaining convexity). Then the new efficient periods are
$i$ and the periods from $t(s)$ to $t(r) = T+1$; the efficient periods between $i+1$ and
$t(s) - 1$ become inefficient. Since a period can become inefficient at most once, one
can verify that the total amount of work spent updating the list $L$ over the entire
algorithm is $O(T)$.

8.2 Models with Capacity Constraints

An important generalization of the Wagner-Whitin model is the inclusion of upper 
bounds on the amount that can be ordered or produced in a given period. This 
corresponds to adding the following constraints to Problem WW: 

\[ y_t \le C_t, \quad t = 1, 2, \ldots, T. \tag{8.5} \]

The values $C_t \ge 0$ correspond to the maximum amount that can be ordered (or
produced) in period $t$ due to, for example, limited production capacities.

In this case, the problem is not as simple as before; Florian et al. (1980) show
that, in general, the problem is NP-complete. Florian and Klein (1971) propose
a dynamic programming approach that involves solving a sequence of acyclic
shortest-path problems for the special case where $C_t = C$ for all $t$. Love (1973) devises
an algorithm based on characterizing the extreme points of the solution space
for the general problem. The branch-and-bound algorithm of Baker et al. (1978) 
seems to be the most computationally effective although it is not polynomial. 

We sketch here the approach of Florian and Klein. For now, assume unequal
capacities; most of the structural results proved by Florian and Klein hold in this
more general case. Clearly, a feasible solution exists if and only if

\[ \sum_{j=1}^{i} C_j \ge \sum_{j=1}^{i} d_j, \quad \text{for } i = 1, 2, \ldots, T. \]
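
The feasibility condition is a comparison of running totals, which the following Python sketch spells out. The function name `is_feasible` and the sample data are illustrative assumptions.

```python
from itertools import accumulate

def is_feasible(C, d):
    """True if cumulative capacity covers cumulative demand in every period."""
    return all(cap >= dem for cap, dem in zip(accumulate(C), accumulate(d)))

print(is_feasible([5, 5, 5], [3, 6, 6]))  # True: totals 5,10,15 cover 3,9,15
print(is_feasible([5, 5, 5], [6, 1, 1]))  # False: period 1 needs 6 > 5
```

Note that the first example is feasible even though the demand of period 2 exceeds the capacity of that period, because earlier slack can be carried forward as inventory.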

We therefore assume this is satisfied. Let 

\[ V = \{ y \in \mathbb{R}^T : y \text{ satisfies (8.1), (8.2), (8.3), and (8.5)} \}, \]

and let D be the set of extreme points of V. Since the objective function is concave 
(why?), we know an optimal solution will exist in D. 

Florian and Klein prove the following inventory decomposition property. 

Theorem 8.2.1 Suppose that the constraint

\[ I_k = 0, \quad \text{for some } k \in \{1, \ldots, T-1\}, \]

is added to Problem WW and

\[ \sum_{j=k+1}^{i} C_j \ge \sum_{j=k+1}^{i} d_j, \quad \text{for } i = k+1, \ldots, T, \]

holds. Then an optimal solution to the original problem can be found by independently
finding solutions to the problems for the first $k$ periods and for the last $T - k$
periods.

This is clearly a generalization of Theorem 8.1.2. Following this idea, call a
period $t$ a regeneration point if $I_t = 0$. Define a production sequence $S_{ij}$, where
$0 \le i < j \le T$, to be

\[ S_{ij} = \{ (y_{i+1}, y_{i+2}, \ldots, y_j) \mid I_i = I_j = 0, \ I_k > 0 \text{ for } i < k < j \}. \]

Clearly, any production plan can be decomposed into a set of production sequences.
Define a production sequence $S_{ij}$ to be capacity-constrained if the production level
in at most one period $k$ ($i+1 \le k \le j$) satisfies $0 < y_k < C_k$ and all other
production levels are either zero or at their capacities.

The authors then characterize the extreme points of $V$ in the following way.

Theorem 8.2.2

\[ y \in D \iff y \text{ consists of capacity-constrained production sequences only.} \]

This characterization is done in several steps. First:

Lemma 8.2.3 If $y \in D$, then $y$ consists only of capacity-constrained production
sequences.



Proof. Suppose $y \in D$ and $S_{ij}$ is a production sequence of $y$ that is not
capacity-constrained. This means there are at least two periods, say $k$ and $\ell$ ($i+1 \le k <
\ell \le j$), in which $0 < y_k < C_k$ and $0 < y_\ell < C_\ell$. Without loss of generality, we can
assume there are only two periods of this type.

Let

\[ \delta = \frac{1}{2} \min \left\{ y_k, \ C_k - y_k, \ y_\ell, \ C_\ell - y_\ell, \ \min_{k \le t < \ell} I_t \right\}, \]

and let $e_n$ be the $(j-i)$-component vector with a 1 in the $n$th position and 0s
everywhere else. Define two production sequences

\[ S'_{ij} = S_{ij} - \delta e_{k-i} + \delta e_{\ell-i} \]

and

\[ S''_{ij} = S_{ij} + \delta e_{k-i} - \delta e_{\ell-i}. \]

Note that production sequence $S'_{ij}$ simply represents a shifting of production from
period $k$ to period $\ell$, while sequence $S''_{ij}$ represents the opposite shift. They are
clearly feasible, and since $\delta > 0$, they are distinct. However, $S_{ij} = \frac{1}{2}(S'_{ij} + S''_{ij})$, a
contradiction. ∎



Lemma 8.2.4 If $y'$ and $y''$ are distinct feasible production plans and $y = \frac{1}{2}(y' +
y'')$, then $y'$ and $y''$ share all the regeneration points of $y$.

Proof. Let period $k$ be a regeneration point of $y$. Then

\[ 0 = \sum_{t=1}^{k} (y_t - d_t) = \frac{1}{2} \left[ \sum_{t=1}^{k} (y'_t - d_t) + \sum_{t=1}^{k} (y''_t - d_t) \right] = \frac{1}{2} (I'_k + I''_k). \]

Since $I'_k, I''_k \ge 0$, both $I'_k$ and $I''_k$ must be zero. ∎

Lemma 8.2.5 If a feasible plan $y$ consists only of capacity-constrained production
sequences, then $y \in D$.

Proof. Assume to the contrary that $y \notin D$. Then there exist feasible plans $y'$ and
$y''$ such that $y = \frac{1}{2}(y' + y'')$.

From Lemma 8.2.4, $y'$ and $y''$ share the regeneration points of $y$. Let $i$ and $j$ be
two such successive regeneration points, and let $S_{ij}$, $S'_{ij}$, and $S''_{ij}$ be the associated
distinct production sequences of $y$, $y'$, and $y''$, respectively. Evidently,

\[ S_{ij} = \frac{1}{2} (S'_{ij} + S''_{ij}). \]

We show that the only possibility is $S_{ij} = S'_{ij} = S''_{ij}$. For this purpose, consider any
period $k$, $i+1 \le k \le j$, and observe that $y_k$ can take only three possible values.
Either $y_k = 0$, in which case $y'_k = y''_k = 0$, or $y_k = C_k$, in which case $y'_k = y''_k = C_k$,
or $0 < y_k < C_k$. Since $S_{ij}$ is a capacity-constrained sequence, at most one period,
say period $\ell$, $i+1 \le \ell \le j$, has $0 < y_\ell < C_\ell$. But the total production between
period $i+1$ and period $j$ must be equal to total demands over the same periods,
and hence $y_\ell = y'_\ell = y''_\ell$. Consequently, $S_{ij} = S'_{ij} = S''_{ij}$. ∎

This completes the proof of Theorem 8.2.2.

It is now clear that an optimal solution must be made up of a sequence of 
optimal capacity-constrained production sequences. However, determining these 
sequences can be quite tedious and computationally expensive. To make the prob- 
lem tractable, Florian and Klein consider the case where the capacity constraints 
are identical and equal to $C$. The demand between any two periods, say periods $i$
and $j$, can then be written as $mC + p$, where $m$ is a nonnegative integer and $0 \le p < C$. Then

Corollary 8.2.6 If $C_t = C$ for all $t$, an optimal production sequence has a number
of periods in which the production levels are equal to $C$, at most one period where
the production level is $p$ with $0 < p < C$, and the remaining periods have zero production
levels.

This simplifies the problem considerably; for example, consider determining the
optimal production sequence between regeneration points $i$ and $j$. From
Corollary 8.2.6, in each period $k \in \{i+1, i+2, \ldots, j\}$, the production is 0, $C$, or $p$ for
some $p \in (0, C)$. Let $Y_k = \sum_{t=i+1}^{k} y_t$ for $i < k \le j$, that is, the amount produced
between periods $i+1$ and $k$ in this production sequence. Then $Y_k$ can only take
on values in $\{0, p, C, C+p, 2C, \ldots, mC, mC+p\}$.

Thus, we can construct a network where the vertices correspond to the possible
values of $Y_k$ for each $i < k \le j$ with directed edges $(Y_k, Y_{k+1})$ defined by the
following:

• If $Y_k = \ell C$, $\ell = 0, 1, \ldots, m$, then there are three edges emanating from this
vertex: one to $Y_{k+1} = \ell C$ (corresponding to no production in period $k+1$); one
to $Y_{k+1} = \ell C + p$ (corresponding to production of $p$ in period $k+1$); and one to
$Y_{k+1} = (\ell+1)C$ (corresponding to production of $C$ in period $k+1$).

• If $Y_k = \ell C + p$, $\ell = 0, 1, \ldots, m$, then there are two edges emanating from this
vertex: one to $Y_{k+1} = \ell C + p$ (corresponding to no production in period $k+1$)
and one to $Y_{k+1} = (\ell+1)C + p$ (corresponding to production of $C$ in period $k+1$).

After creating an artificial initial vertex $Y_0$, we see that every path from $Y_0$ to
$Y_j$ represents a feasible capacity-constrained production sequence. If we assign
arc costs equal to the cost of producing and storing the corresponding product
amounts, it is clear that finding the optimal production sequence from $i$ to $j$ is no
harder than solving the shortest-path problem on this network. The complexity of
this procedure is clearly proportional to $(j-i)^2$; thus, determining the optimal
production sequences between all pairs of periods is $O(T^4)$.

To determine the optimal production plan over the entire planning horizon, 
Florian and Klein solve another shortest-path problem on a network similar to 
the one formulated in Sect. 8.1. That is, the length of an arc $(i, j)$ in this network
is the total cost of the optimal production sequence from $i$ to $j$. After solving
the shortest-path problem, we can find the optimal set of regeneration points by
checking the shortest path. This step is $O(T^2)$.
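
The two-level procedure can be sketched in Python for the equal-capacity case. This is a simplified illustration under stated assumptions, not a tuned implementation of Florian and Klein's algorithm: demands and $C$ are taken to be integers, the constant variable cost is dropped, inventory inside a sequence is allowed to touch zero (which only adds feasible plans), and the function names are hypothetical. The inner routine is the shortest-path computation over cumulative production levels; the outer routine is the shortest path over regeneration points.

```python
from math import inf

def sequence_cost(d, C, K, h, i, j):
    """Cheapest plan for periods i+1..j with I_i = I_j = 0, using production
    levels 0, C, and (at most once) the remainder p of total demand mod C."""
    total = sum(d[i:j])                    # demand of periods i+1..j
    p = total % C
    states = {0: 0.0}                      # cumulative production -> cheapest cost
    served = 0                             # cumulative demand so far
    for k in range(i + 1, j + 1):
        served += d[k - 1]
        nxt = {}
        for Y, cost in states.items():
            levels = [0, C]
            if p > 0 and Y % C == 0:       # partial batch p not used yet
                levels.append(p)
            for y in levels:
                Y2, inv = Y + y, Y + y - served
                if Y2 > total or inv < 0:  # overproduction or shortage
                    continue
                c2 = cost + (K if y > 0 else 0) + h * inv
                if c2 < nxt.get(Y2, inf):
                    nxt[Y2] = c2
        states = nxt
    return states.get(total, inf)

def capacitated_lot_sizing(d, C, K, h):
    """Minimal setup-plus-holding cost with per-period capacity C."""
    T = len(d)
    best = [inf] * (T + 1)                 # best[j]: cheapest plan with I_j = 0
    best[0] = 0.0
    for j in range(1, T + 1):
        for i in range(j):
            if best[i] < inf:
                best[j] = min(best[j], best[i] + sequence_cost(d, C, K, h, i, j))
    return best[T]

print(capacitated_lot_sizing([3, 4, 2], C=5, K=10, h=1))  # 23.0
```

In the example the total demand is $9 = 1 \cdot 5 + 4$, and the best plan produces $p = 4$ in period 1 and $C = 5$ in period 2, for two setups plus three unit-periods of holding.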

8.3 Multi-Item Inventory Models

In many practical situations, the coordination of inventory and ordering policies 
involves a variety of different products, and this complicates the problem consider- 
ably. Consider the uncapacitated case once again, and assume there are n products. 
Each product faces a known demand during the next T periods. In addition, a fixed 
order cost of $K_i$ is incurred every time product $i$ is ordered.

For each product $i$, define the following:

• Let $y_{it}$ be the amount of product $i$ ordered in period $t$, for $t = 1, 2, \ldots, T$.

• Let $h_i$ be the inventory holding cost for product $i$.

• Let $I_{it}$ be the amount of product $i$ in inventory at the end of period $t$, for
$t = 1, 2, \ldots, T$.

• Let $d_{it}$ be the demand in period $t$ for product $i$, for $t = 1, 2, \ldots, T$.

If we make the same assumptions as in the Wagner-Whitin model, the problem 
is then 

\[
\begin{aligned}
\text{Problem } P: \quad \min \ & \sum_{t=1}^{T} \sum_{i=1}^{n} \left[ K_i \delta(y_{it}) + h_i I_{it} \right] \\
\text{s.t.} \ & I_{it} = I_{i,t-1} + y_{it} - d_{it}, \quad i = 1, 2, \ldots, n, \ t = 1, 2, \ldots, T, && (8.6) \\
& I_{i0} = 0, \quad i = 1, 2, \ldots, n, && (8.7) \\
& I_{it}, y_{it} \ge 0, \quad i = 1, 2, \ldots, n, \ t = 1, 2, \ldots, T. && (8.8)
\end{aligned}
\]

Here (8.6) are inventory-balance constraints for each product, while (8.7) specify
the starting inventory for each product.

It is easy to see that P decomposes into $n$ single-product problems. Each of
these single-product problems can be solved using the algorithms for the Wagner- 
Whitin model. 

A more realistic version of this problem is when a joint setup cost $K_0$ is present.
This cost is incurred whenever any product is ordered. The problem then becomes

\[
\begin{aligned}
\text{Problem } P': \quad \min \ & \sum_{t=1}^{T} \left[ K_0 \delta\!\left( \sum_{i=1}^{n} y_{it} \right) + \sum_{i=1}^{n} \left( K_i \delta(y_{it}) + h_i I_{it} \right) \right] \\
\text{s.t.} \ & \text{(8.6), (8.7), and (8.8).}
\end{aligned}
\]

Unfortunately, this problem is considerably more difficult to solve than the
simple Wagner-Whitin model. In fact, Arkin et al. (1989) prove that it is
NP-complete. Several researchers have proposed heuristics for this problem, including
Silver (1976), Atkins and Iyogun (1988), and Joneja (1990). We present here
Joneja’s approach.

Joneja’s cost-covering heuristic proceeds period by period in a forward direction.
Specifically, at period $t$, the ordering policy of periods $1, 2, \ldots, t-1$ has been
determined and the decision is which items to order, if any, in period $t$. Let $t_i$ be
the last period in which item $i$ was ordered. Let $H_{it}$ denote the total inventory
holding cost incurred by item $i$ since period $t_i$ assuming no order for item $i$ is
placed in period $t$. That is,

\[ H_{it} = h_i \sum_{j=t_i+1}^{t} (j - t_i) d_{ij}. \]

Intuitively, if we forget for the moment the joint order cost and $H_{it} > K_i$, then
it is worth ordering item $i$ in period $t$, since it costs more to keep an item in
inventory from period $t_i$ (the last time item $i$ was ordered) to $t$ than to order it
in period $t$. The quantity $\max\{H_{it} - K_i, 0\}$ can be seen as the savings that are
accrued by ordering item $i$ in period $t$. This approach is basically the Silver-Meal
heuristic adapted to the multiple-item case. With the joint order cost present, an
order should be placed only if the total savings accrued by ordering a set of items
in period $t$ exceeds the joint order cost. Therefore, Joneja proposes the following
ordering rule.

Rule 1. In period $t$, order those items $i$ such that $H_{it} > K_i$ if $\sum_{i=1}^{n} \max\{H_{it} -
K_i, 0\} > K_0$.
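
As a minimal sketch of Rule 1 alone (it is not the full cost-covering heuristic), the following Python function computes $H_{it}$ for each item and applies the joint-savings test. The function name `rule1_items`, the argument layout, and the sample numbers are illustrative assumptions; items are indexed from 0 and periods from 1.

```python
def rule1_items(t, last_order, demand, h, K, K0):
    """Items to order in period t under Rule 1.

    last_order[i] is t_i, demand[i][j-1] is d_{ij}, h[i] and K[i] are the
    holding and item order costs, and K0 is the joint order cost.
    """
    savings = []
    for i in range(len(K)):
        ti = last_order[i]
        # Holding cost accumulated by item i since its last order.
        H_it = h[i] * sum((j - ti) * demand[i][j - 1] for j in range(ti + 1, t + 1))
        savings.append(H_it - K[i])
    # Order only if the pooled positive savings cover the joint order cost.
    if sum(max(s, 0) for s in savings) > K0:
        return [i for i, s in enumerate(savings) if s > 0]
    return []

demand = [[4, 3, 6], [2, 1, 2]]
print(rule1_items(3, [1, 1], demand, h=[1, 1], K=[5, 8], K0=9))   # [0]
print(rule1_items(3, [1, 1], demand, h=[1, 1], K=[5, 8], K0=10))  # []
```

Here item 0 has $H_{03} = 3 + 2 \cdot 6 = 15$ and a saving of 10, while item 1 has $H_{13} = 5 < K_1$; an order is triggered only when the joint cost is strictly below 10.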

Joneja shows that this single rule is not quite strong enough to ensure that the
schedule of orders is cost-efficient. For instance, consider the following example
with two products. The holding costs are equal ($h_1 = h_2 = 1$). Pick an integer $m$
and set the demands to

\[
\begin{aligned}
d_{1t} &= 0, \quad \text{for } t = 1, 2, \ldots, m-1, & d_{1m} &= \frac{K_0 + K_1}{m-1}, \\
d_{2t} &= 0, \quad \text{for } t = 1, 2, \ldots, m, & d_{2,m+1} &= \frac{K_0 + K_2}{m}.
\end{aligned}
\]

With Rule 1, item 1 will be ordered at time $m$, but not item 2. Item 2 will
be ordered at time $m+1$. If both items were ordered at time $m$, then we pay
$h_2 d_{2,m+1} = (K_0 + K_2)/m$ in extra holding costs but save $K_0$ in ordering costs. Therefore,
for large $m$, we see that we can be far from optimal.

To counteract this behavior, Joneja proposes the following additional feature.
Let $t_0$ be the time at which the last joint order was placed, and assume item $i$ was
not included in this order (since $H_{i t_0} \le K_i$). It may, in some cases, be advantageous
to order item $i$ at time $t_0$ even though Rule 1 would specify the opposite. Define

\[ S_{it} = h_i (t_0 - t_i) \sum_{j=t_0}^{t} d_{ij}. \]

Then $S_{it}$ is the savings in inventory holding cost accrued by ordering item $i$ at
time $t_0$. Since a joint order is already placed in period $t_0$, the following rule was
proposed.

Rule 2. In period $t$, if the last joint order was in period $t_0$, item $i$ was not ordered
in period $t_0$, and $S_{it} > K_i$, then order item $i$ in period $t_0$.

Computational experiments with this heuristic, whose complexity is 
O(nT), show that it produces solutions fairly close to optimal. 

8.4 Single-Item Models with Pricing



The previous models focusing solely on inventory replenishment can be naturally
extended to settings in which demand is endogenously determined by pricing decisions.
For simplicity, we only consider the extension of Problem WW. Specifically,
assume that at the beginning of period $t$ ($t = 1, 2, \ldots, T$), we can set a selling price
$p_t$ in addition to the ordering quantity $y_t$. The demand in period $t$ is assumed to
be a continuous function of the current period selling price $p_t$, denoted as $d_t(p_t)$.
We are now facing an integrated inventory and pricing model. The objective is to
find a sequence of order quantities $y_t$ and prices $p_t$ so as to maximize the total
profit over the planning horizon.

Upon denoting a pricing plan $(p_1, p_2, \ldots, p_T)$ and its corresponding demand
sequence $(d_1(p_1), d_2(p_2), \ldots, d_T(p_T))$, we find that a mathematical model for the
integrated inventory and pricing problem is

\[
\begin{aligned}
\max \ & \sum_{t=1}^{T} p_t d_t(p_t) - C(d_1(p_1), d_2(p_2), \ldots, d_T(p_T)) \\
\text{s.t.} \ & p_t \in [\underline{p}_t, \overline{p}_t], \quad t = 1, 2, \ldots, T,
\end{aligned}
\tag{8.9}
\]

where a lower bound $\underline{p}_t$ and an upper bound $\overline{p}_t$ on the selling price $p_t$ are imposed
to prevent a low profit margin and an unreasonably high price, respectively. In
the objective function of the above problem, the first term is the total revenue,
and the second term $C(d_1(p_1), d_2(p_2), \ldots, d_T(p_T))$ is the minimum ordering and
inventory holding cost over the planning horizon for a given pricing plan, which is
the optimal objective value of Problem WW when $(d_1, d_2, \ldots, d_T)$ is replaced by
$(d_1(p_1), d_2(p_2), \ldots, d_T(p_T))$. Notice that unlike the models in the previous sections,
the variable order cost cannot be ignored here.

To develop an efficient algorithm, observe that Theorem 8.1.1 still holds; that is,
any optimal replenishment policy is a zero-inventory-ordering policy. To see this,
simply note that given any fixed price sequence $(p_1, p_2, \ldots, p_T)$, Problem (8.9) is
equivalent to Problem WW. It is clear that for any ordering plan with the
zero-inventory-ordering property, it suffices to specify the ordering periods. Specifically,
if periods $i$ and $j$ ($i < j$) are two consecutive ordering periods in such an ordering
plan, the zero-inventory-ordering property implies that the demand at period $t$
($i \le t < j$) is filled by the order placed at period $i$ only, and thus the marginal
cost of satisfying period $t$’s demand is given by $c + (t - i)h$. The associated optimal
price for period $t$ with $i \le t < j$ can then be derived, independent of other periods’
prices, by solving the following optimization problem:

\[
\begin{aligned}
v_{it} = \max \ & p_t d_t(p_t) - (c + (t - i)h) d_t(p_t) \\
\text{s.t.} \ & p_t \in [\underline{p}_t, \overline{p}_t],
\end{aligned}
\tag{8.10}
\]

where the objective function is the profit of period $t$, when taking into account the
marginal cost of satisfying the demand of that period. The dynamic programming
algorithm of Wagner and Whitin can be easily adapted here. We can construct
the same acyclic network. The only difference is that when we define the acyclic
network, the length of arc $(i, j)$ with $1 \le i < j \le T+1$ is given by

\[ \ell_{ij} = K - \sum_{t=i}^{j-1} v_{it}, \]

which is the negative of the maximum total profit obtained from satisfying demand
from period $i$ to period $j-1$ with a single order at period $i$. With this modification,
the length of a shortest path from node 1 to node $T+1$ in the acyclic network
gives the negative of the maximum total profit over the planning horizon. The
optimal ordering periods can be easily reconstructed from the shortest path, and
the optimal prices can be derived through Problem (8.10) once the corresponding
ordering periods are known. The algorithm involves solving $O(T^2)$ single-variable
optimization problems of the form (8.10) and finding a shortest path in the acyclic
network, which can be done in $O(T^2)$.
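
A minimal Python sketch of this procedure follows, under an assumption that is not part of the text: demand is linear, $d_t(p) = a_t - b_t p$ with $b_t > 0$ and $a_t - b_t \overline{p} \ge 0$, so that each single-variable problem (8.10) has a closed-form solution obtained by clamping the unconstrained maximizer to the price bounds. The function names and the common bounds `lo` and `hi` are hypothetical.

```python
from math import inf

def best_price(a, b, m, lo, hi):
    """Maximize (p - m) * (a - b * p) over lo <= p <= hi (linear demand assumed)."""
    p = min(max((a / b + m) / 2, lo), hi)   # clamp the unconstrained maximizer
    return p, (p - m) * (a - b * p)

def pricing_lot_sizing(a, b, c, K, h, lo, hi):
    """Return (max_profit, order_periods) for the integrated pricing model."""
    T = len(a)
    best = [inf] * (T + 2)                  # shortest distances in the network
    pred = [0] * (T + 2)
    best[1] = 0.0
    for i in range(1, T + 1):
        profit = 0.0
        for j in range(i + 1, T + 2):
            t = j - 1                       # period newly covered by the order at i
            _, v = best_price(a[t - 1], b[t - 1], c + (t - i) * h, lo, hi)
            profit += v
            length = K - profit             # arc length: negative of the profit
            if best[i] + length < best[j]:
                best[j] = best[i] + length
                pred[j] = i
    periods, node = [], T + 1
    while node > 1:
        node = pred[node]
        periods.append(node)
    return -best[T + 1], periods[::-1]

print(pricing_lot_sizing([100, 100], [1, 1], c=10, K=200, h=5, lo=0, hi=100))
# (3650.0, [1, 2])
print(pricing_lot_sizing([100, 100], [1, 1], c=10, K=300, h=5, lo=0, hi=100))
# (3531.25, [1])
```

With the lower fixed cost it pays to order in both periods and price at 55 in each; raising $K$ to 300 makes a single order preferable, and the price in period 2 rises to 57.5 to reflect the higher marginal cost $c + h$.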

The extension presented here first appeared in Wagner and Whitin (1958a) 
around the same time they developed Problem WW. The algorithm in Sect. 8.2 
can also be easily modified to deal with integrated inventory and pricing models 
with capacity constraints. We refer to Deng and Yano (2006), Geunes et al. (2006), 
and Chen and Simchi-Levi (2012) for details. For integrated inventory and pricing 
models with stochastic demand, see Chap. 10. 



8.5 Exercises 



Exercise 8.1. Assume order costs are general concave and time-dependent func- 
tions of the number of items produced. Also, assume holding costs are general 
concave and time-dependent functions of the number of items held in inventory. 
Prove that the zero-inventory-ordering property holds in this general setting as 
well. 

Exercise 8.2. The Silver-Meal heuristic works as follows. Let $d_1, d_2,
\ldots, d_n$ be the demands in the $n$-period planning horizon. Define $C(T)$ to be the
per-period average holding and setup cost under the condition that the current
order covers demand in the next $T$ periods. Then $C(1) = K$, $C(2) = \frac{1}{2}(K + h d_2)$,
and so on. In the Silver-Meal heuristic, we calculate these until $C(i) > C(i-1)$.
In this case, we stop and produce in period 1 to meet the demand of the first $i-1$
periods. We then start over with the $i$th period.

Construct an example where the Silver-Meal heuristic provides a nonoptimal 
solution. 

Exercise 8.3. Consider the integrated inventory and pricing model in Sect. 8.4 
with one additional requirement that the prices are the same throughout the plan- 
ning horizon. Develop an efficient algorithm to solve it. Can you extend your 
algorithm to cases in which the parameters are time-dependent? 



9 

Stochastic Inventory Models 



9.1 Introduction 

The inventory models considered so far are all deterministic in nature; demand 
is assumed to be known and either constant over the infinite horizon or varying 
over a finite horizon. In many logistics systems, however, such assumptions are not 
appropriate. Typically, demand is a random variable whose distribution may be 
known. 

Stochastic inventory models have attracted considerable attention in the last 
three decades. The pioneering work of Arrow, Harris and Marschak (1951), Scarf 
(1960), Iglehart (1963a and b), and Veinott and Wagner (1965) for a single ware- 
house, Clark and Scarf (1960) for multi-echelon systems, Eppen and Schrage (1981) 
and Federgruen and Zipkin (1984a-c) for distribution systems, and Rosling (1989) 
for assembly systems all represent milestones in our understanding of complex 
stochastic logistics systems. More recently, the works of Zheng (1991), Zheng and 
Federgruen (1991), Chen and Zheng (1994), and Zipkin (2008) reveal new insights 
and provide more efficient algorithms for these problems. For recent reviews, we 
refer the reader to Lee and Nahmias (1993), Porteus (1990), and Zipkin (2000). 

In this chapter, we review some of the main results in stochastic inventory
models. We start with the analysis of a single-warehouse model. To build our
intuition, Sect. 9.2 considers a single-period model. In Sects. 9.3 and 9.4, we show
that the insight obtained in the previous section can be used to analyze a
multiperiod model. Section 9.5 extends the analysis further to the infinite-horizon
model. Models with positive lead times are addressed in Sect. 9.6. Finally, Sect. 9.7
describes the development of interesting bounds on the optimal cost for
multi-echelon systems.



9.2 Single-Period Models 



9.2.1 The Model 



Consider a risk-neutral company that designs, produces, and sells winter fashion 
items such as ski jackets and coats. About six months before the winter season, the 
company must commit itself to specific production quantities for all its products. 
Since there is no clear indication as to how the market will respond to the new 
designs, these decisions are typically based on realized sales from the last few years, 
current economic conditions, and professional judgment. 

To assist management in selecting production quantities, the marketing
department assumes that demand $D$ for each new product is randomly distributed,
generated from a product-specific distribution with continuous cdf $F(\cdot)$. Additional
information available to the decision makers includes the variable production cost
per unit $c$, the selling price per unit $r$, and the salvage value per unit $v$. Clearly,
these variables should satisfy $r > c > v$; otherwise, the problem can trivially be
solved.

Since demand is a random variable, the decision concerning how many units
to produce is based on the expected cost $z(y)$, which is a function of the amount
produced, $y$. This expected cost is

$$z(y) = cy - rE[\min(y, D)] - vE[\max(0, y - D)] \quad \text{for } y \ge 0,$$

where $E[\cdot]$ denotes the expectation. Note that

$$E[\min(y, D)] = \int_0^y D\,dF(D) + y\int_y^\infty dF(D).$$

Adding and subtracting the quantity $r\int_y^\infty D\,dF(D)$ to $z(y)$, we get

$$z(y) = cy - rE[D] - r\int_y^\infty (y - D)\,dF(D) - v\int_0^y (y - D)\,dF(D). \tag{9.1}$$
The objective is, of course, to choose y so as to minimize the expected cost z(y). 
This is known as the newsboy problem or newsvendor problem. 

Taking the derivative of $z(y)$ with respect to $y$ and using the Leibniz rule, we
get the first-order optimality condition:

$$c - r(1 - \Pr\{D \le y\}) - v\Pr\{D \le y\} = 0,$$

which implies that the optimal production quantity $S$ should satisfy

$$\Pr\{D \le S\} = \frac{r - c}{r - v}.$$

Since, by assumption, $r - c < r - v$ and $F(D)$ is continuous, a finite value $S$, $S > 0$,
always exists. In addition, it can easily be verified that the expected cost $z(y)$ is
convex for $y \in (0, \infty)$ and that the value of $z(y)$ tends to infinity as $y \to \infty$.
Hence, the quantity $S$ is a minimizer of $z(y)$.
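As a small numerical illustration of the critical-ratio condition, the following Python sketch computes $S$ for a normally distributed demand. The function name `newsvendor_quantity`, the choice of a normal distribution (which allows negative demand and is therefore only an approximation), and all numbers are assumptions made for illustration.

```python
from statistics import NormalDist


def newsvendor_quantity(r, c, v, demand):
    """Return S satisfying Pr{D <= S} = (r - c) / (r - v)."""
    if not r > c > v:
        raise ValueError("the model assumes r > c > v")
    critical_ratio = (r - c) / (r - v)
    return demand.inv_cdf(critical_ratio)


# Hypothetical data: price 100, cost 60, salvage value 20, demand ~ Normal(1000, 200).
S = newsvendor_quantity(r=100, c=60, v=20, demand=NormalDist(mu=1000, sigma=200))
print(round(S, 1))  # the critical ratio is 0.5, so S equals the mean: 1000.0
```

Raising the selling price or the salvage value increases the critical ratio and hence the production quantity, while a higher production cost lowers it.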

Observe that, implicitly, three assumptions have been made in the above anal- 
ysis. First, there is no initial inventory. Second, there is no fixed setup cost for 
starting production. Third, the excess demand is lost; that is, if the demand D 
happens to be greater than the produced quantity y, then the additional revenue 
r(D — y) is lost. 

The tools developed so far allow us to extend the above results to models with
initial inventory $y_0$ and setup cost $K$. We now relax the first two assumptions.
Observe that the expected cost of producing $(y - y_0)$ units is

$$K - cy_0 + z(y).$$

Hence, $S$ clearly minimizes this expected cost if we decide to produce.
Consequently, there are two cases to consider.

1. If $y_0 \ge S$, we should not produce anything.

2. If $y_0 < S$, the best we can do is to raise the inventory to level $S$. However,
this is optimal only if $-cy_0 + z(y_0)$, the cost associated with not producing
anything, is larger than or equal to $K - cy_0 + z(S)$, the cost associated with
producing $S - y_0$. That is, if $y_0 < S$, it is optimal to produce $S - y_0$ only if
$z(y_0) \ge K + z(S)$.

Let $s$ be a number such that

$$z(s) = K + z(S).$$

The discussion above implies that the optimal policy is of the (s, S) type. 

Definition 9.2.1 An $(s, S)$ policy is a policy in which we order $S - y_0$ if the initial
inventory level $y_0$ is at or below $s$, and do not order otherwise.

The quantity $S$ is called the order-up-to level, while $s$ is referred to as the reorder
point. In the special case with zero fixed ordering cost, we have $s = S$ and the
policy reduces to a base stock policy: When the initial inventory level is no more
than $S$, make an order to raise the inventory level to $S$; otherwise, no order is
placed.
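The single-period $(s, S)$ rule can be illustrated with a short Python sketch. To keep the arithmetic exact and easy to check, we assume here a discrete demand distribution on an integer grid (the text assumes a continuous cdf); the names `PMF`, `z`, `order_quantity`, and all numbers are hypothetical.

```python
# Hypothetical data: demand uniform on {0, ..., 4}; r > c > v as in the model.
PMF = {d: 0.2 for d in range(5)}
R, C, V, K = 10.0, 6.0, 2.0, 2.0


def z(y):
    """Expected cost z(y) = c*y - r*E[min(y, D)] - v*E[max(0, y - D)]."""
    sales = sum(p * min(y, d) for d, p in PMF.items())
    leftover = sum(p * max(0, y - d) for d, p in PMF.items())
    return C * y - R * sales - V * leftover


S = min(range(0, 11), key=z)    # order-up-to level on the integer grid


def order_quantity(y0):
    """Produce S - y0 only if y0 < S and z(y0) >= K + z(S); otherwise nothing."""
    if y0 < S and z(y0) >= K + z(S):
        return S - y0
    return 0


print(S, [order_quantity(y0) for y0 in range(4)])  # 2 [2, 0, 0, 0]
```

Here $z(0) = 0$, $z(1) = -2.4$, and $z(2) = -3.2$, so with $K = 2$ it pays to produce only when the initial inventory is 0; in this discrete example the reorder point is therefore $s = 0$ and the order-up-to level is $S = 2$.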



9.3 Finite-Horizon Models 

9.3.1 Model Description 

We are now ready to consider the finite-horizon (multiperiod) inventory problem.
This problem can be described as follows. At the beginning of each period, for
example, each week or every month, the inventory of a certain item at the warehouse
is reviewed and the inventory level is noted. Then an order may be placed to raise
the inventory level up to a certain level. Replenishment orders arrive instantly.
The cases with nonzero lead times will be discussed in Sect. 9.6.

We assume that demands for successive periods are independently and
identically distributed (iid). If the demand exceeds the inventory on hand, then the
additional demand is backlogged and is filled when additional inventory becomes
available. Thus, the backlogged units are viewed as negative inventory. The
inventory left over at the end of the final period has a value of $c$ per unit, and all
unfilled demand at this time can be backlogged at the same cost $c$. As we shall
see, these assumptions ensure that the expected (gross) revenue in each period is
a constant, and therefore we will not include the revenue term in our formulation.
Lost sales models are addressed in Sect. 9.6.

Costs include ordering, holding, and backorder costs. Ordering cost consists of
a setup cost, $K$, charged every time the warehouse places a replenishment order,
and a proportional purchase cost $c$. There is a holding cost of $h^+$ for each unit
of the inventory on hand at the end of a period and a backorder cost of $h^-$ per
unit whenever demand exceeds the inventory on hand. To avoid triviality, we
assume $h^-, h^+ > 0$ (why?). The objective is to determine an inventory policy that
minimizes the expected cost over $T$ periods. In what follows, we show that an
$(s_t, S_t)$ policy is optimal. Of course, an $(s_t, S_t)$ policy is similar to the $(s, S)$ policy
described earlier except that the parameters $s$ and $S$ may vary from period to
period.

To characterize the optimal policy for the finite-horizon model, we first develop
a dynamic programming formulation of the problem. Let $x_t$ be the inventory level
at the beginning of period $t$ (before possible ordering).

If the inventory level immediately after ordering is $y$, then the expected
one-period inventory holding and backorder cost for that period is

$$G(y) = h^+ \int_D \max(y - D, 0)\,dF(D) + h^- \int_D \max(D - y, 0)\,dF(D), \tag{9.2}$$

which is called the one-period loss function. Since the maximum of convex
functions is convex and convexity is preserved under integration, we see that $G(y)$ is
convex.

Given a policy $Y = (y_1, y_2, \ldots, y_T)$, where $y_t$ is the order-up-to level (random
variable) of period $t$ and may be contingent upon other variables, the sum of the
total expected proportional purchasing cost and salvage value, $P^Y$, is given by

$$P^Y = E\left[\sum_{t=1}^{T} c(y_t - x_t) - c(y_T - D_T)\right],$$

where $D_t$ is the realized demand in period $t$. Noting that $x_{t+1} = y_t - D_t$, we have

$$P^Y = cE[y_1 - x_1 + y_2 - (y_1 - D_1) + \cdots + y_T - (y_{T-1} - D_{T-1}) + D_T - y_T] = cTE[D] - cx_1.$$

Thus, since the initial inventory $x_1$ is given, $P^Y$ is independent of the ordering
policy, and we can drop the linear ordering cost component from the formulation.
This observation is quite intuitive, since all backlogged demand is filled at the end
of the last period, while all remaining inventory left at this period is salvaged, both
at the same price $c$. We also remark that whenever possible, we will suppress the
subscript $t$ from $D_t$ (because demands are iid) and $y_t$.

Recall that $x_t$ is the inventory level, prior to ordering, at the beginning of
period $t$. To formulate the dynamic program, define the following two expected cost
functions. Let $G_t(x_t)$ be the expected cost for the remaining $T - t + 1$ periods if
we do not order in period $t$, and act optimally in the remaining $T - t$ periods.
Let $z_t(x_t)$ be the minimal expected cost incurred through the remaining $T - t + 1$
periods if we act optimally in period $t$ and all the remaining $T - t$ periods. It follows
that for $t = 1, 2, \ldots, T$,

$$G_t(y) = G(y) + \int_D z_{t+1}(y - D)\,dF(D) \tag{9.3}$$

and

$$z_t(x) = \min_{y \ge x}\{K\delta(y - x) + G_t(y)\}, \tag{9.4}$$

where $z_{T+1}(x) = 0$ for any $x$, and $\delta(u)$ is 1 if $u > 0$ and 0 otherwise.

Note that if we order up to the level $y > x_t$ in period $t$, the cost for the final
$T - t + 1$ periods is $K + G_t(y)$.
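The recursion (9.3)-(9.4) can be evaluated numerically by backward induction. The following Python sketch does so under several simplifying assumptions that are ours and not part of the model: demand is discrete and uniform on $\{0, \ldots, 4\}$, inventory levels are integers, and order-up-to levels are capped at a hypothetical bound `Y_MAX`. All names and numbers are illustrative.

```python
from functools import lru_cache

# Hypothetical data, not from the text: demand uniform on {0, ..., 4}.
PMF = {d: 0.2 for d in range(5)}
H_PLUS, H_MINUS = 1.0, 3.0    # holding and backorder cost per unit
K = 5.0                       # fixed ordering cost
T = 3                         # number of periods
Y_MAX = 12                    # assumed cap on order-up-to levels (truncation)


def G(y):
    """One-period loss function (9.2) for a discrete demand distribution."""
    return sum(p * (H_PLUS * max(y - d, 0) + H_MINUS * max(d - y, 0))
               for d, p in PMF.items())


@lru_cache(maxsize=None)
def G_t(t, y):
    """Equation (9.3): cost-to-go when the post-order level in period t is y."""
    return G(y) + sum(p * z(t + 1, y - d) for d, p in PMF.items())


@lru_cache(maxsize=None)
def best_action(t, x):
    """Equation (9.4): return (z_t(x), best order-up-to level)."""
    best = (G_t(t, x), x)                  # option 1: do not order
    for y in range(x + 1, Y_MAX + 1):      # option 2: pay K and raise to y > x
        cost = K + G_t(t, y)
        if cost < best[0]:
            best = (cost, y)
    return best


def z(t, x):
    """Minimal expected cost from period t on; z_{T+1} is identically zero."""
    return 0.0 if t > T else best_action(t, x)[0]


# Best post-order inventory level for each period and starting level x = -3, ..., 5.
for t in range(1, T + 1):
    print(t, [best_action(t, x)[1] for x in range(-3, 6)])
```

In the last period, $G_T = G$, and with these numbers $G$ is minimized at $y = 3$ with $G(3) = 1.8$; since $G(-1) = 9 > 6.8 = K + G(3)$ and $G(0) = 6 < 6.8$, the sketch orders up to 3 when $x \le -1$ and does not order when $x \ge 0$.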

Notice that the functions $G_t(y)$ and $z_t(y)$ are, in general, not convex and may
even have many local minima. In order to show that an $(s, S)$ policy is optimal for
this model, we employ the concept of $K$-convexity, introduced by Scarf (1960),
which provides us with a powerful tool to analyze stochastic inventory models with
fixed ordering cost.

9.3.2 K-Convex Functions

Definition 9.3.1 A real-valued function $f$ is called $K$-convex for $K \ge 0$ if for
any $x_0 \le x_1$ and $\lambda \in [0, 1]$,

$$f((1 - \lambda)x_0 + \lambda x_1) \le (1 - \lambda)f(x_0) + \lambda f(x_1) + \lambda K. \tag{9.5}$$
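To get a feel for the definition, one can test inequality (9.5) numerically on a finite grid. The sketch below is only a grid check, not a proof; the helper `is_k_convex` and the example function `phi` are hypothetical names introduced for illustration.

```python
def is_k_convex(f, K, points, tol=1e-9):
    """Check inequality (9.5) for every x0 < x < x1 drawn from `points`."""
    pts = sorted(points)
    for i, x0 in enumerate(pts):
        for k in range(i + 2, len(pts)):
            x1 = pts[k]
            for x in pts[i + 1:k]:
                lam = (x - x0) / (x1 - x0)   # x = (1 - lam) * x0 + lam * x1
                bound = (1 - lam) * f(x0) + lam * f(x1) + lam * K
                if f(x) > bound + tol:
                    return False
    return True


def phi(x):
    """Hypothetical example: constant 4 to the left of -2, x^2 elsewhere."""
    return 4.0 if x <= -2 else float(x * x)


grid = range(-6, 7)
print(is_k_convex(lambda x: x * x, 0, grid))  # True: a convex function is 0-convex
print(is_k_convex(phi, 0, grid))              # False: phi is not convex
print(is_k_convex(phi, 4, grid))              # True on this grid
```

The function `phi` equals $\min_{y \ge x}\{4\delta(y - x) + y^2\}$, that is, the cost of either staying at $x$ or paying a fixed charge of 4 to move to the minimizer of $y^2$; it has a flat piece followed by a convex piece, which is the typical shape of a $K$-convex function in inventory models.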

Below we summarize properties of $K$-convex functions.

Lemma 9.3.2 (a) A real-valued convex function is also 0-convex and hence
$K$-convex for all $K \ge 0$. A $K_1$-convex function is also a $K_2$-convex function
for $K_1 \le K_2$.

(b) If $f_1(y)$ and $f_2(y)$ are $K_1$-convex and $K_2$-convex, respectively, then for
$\alpha, \beta \ge 0$, $\alpha f_1(y) + \beta f_2(y)$ is $(\alpha K_1 + \beta K_2)$-convex.

(c) If $f(y)$ is $K$-convex and $\xi$ is a random variable, then $E_\xi[f(y - \xi)]$ is also
$K$-convex, provided $E[|f(y - \xi)|] < \infty$ for all $y$.

(d) Assume that $f$ is a continuous $K$-convex function and $f(y) \to \infty$ as $|y| \to \infty$.
Let $S$ be a minimum point of $f$ and $s$ be any element of the set

$$\{x \mid x \le S,\ f(x) = f(S) + K\}.$$

Then the following results hold.

(i) $f(S) + K = f(s) \le f(y)$, for all $y \le s$.

(ii) $f(y)$ is a nonincreasing function on $(-\infty, s)$.

(iii) $f(y) \le f(z) + K$ for all $y, z$ with $s \le y \le z$.

Proof. Parts (a), (b), and (c) are straightforward and are left as an exercise. Hence,
we focus on part (d).

Let $S$ be a minimum point of the function $f$ and let $s$ be any element of the set
$\{x \mid x \le S,\ f(x) = f(S) + K\}$.

The existence of $s$ and $S$ is guaranteed since $f$ is continuous and $f(y) \to \infty$ as
$|y| \to \infty$.

Consider any $y$ and $y'$ with $y \le y' \le s$; there exists a $\lambda \in [0, 1]$ such that
$y' = (1 - \lambda)y + \lambda S$. The $K$-convexity of the function $f(x)$ implies that

$$f(y') \le (1 - \lambda)f(y) + \lambda(f(S) + K) = (1 - \lambda)f(y) + \lambda f(s). \tag{9.6}$$

Part (d)(i) follows from (9.6) upon letting $y' = s$, which immediately implies part
(d)(ii).

Finally, consider any $y$ and $z$ with $s \le y \le z$. If $y \le S$, there exists a $\lambda \in [0, 1]$
such that $y = (1 - \lambda)s + \lambda S$. Since $f(x)$ is $K$-convex and $S$ is a global minimizer
of the function $f$, we have

$$f(y) \le (1 - \lambda)(f(s) - K) + \lambda f(S) + K = f(S) + K \le f(z) + K.$$

If $y > S$, there exists a $\lambda \in [0, 1]$ such that $y = (1 - \lambda)S + \lambda z$. Again, the
$K$-convexity of the function $f$ and the definition of $S$ imply that

$$f(y) \le (1 - \lambda)f(S) + \lambda f(z) + \lambda K \le f(z) + K.$$

∎

Figure 9.1 gives an illustration of the properties of $K$-convex functions in
Lemma 9.3.2 part (d).

Proposition 9.3.3 If $f(x)$ is a $K$-convex function, then the function

$$\phi(x) = \min_{y \ge x}\{Q\delta(y - x) + f(y)\}$$

is $\max\{K, Q\}$-convex.

Proof. We only need to discuss the case $K \ge Q$. In fact, when $K < Q$, the
$K$-convexity of $f(x)$ implies the $Q$-convexity of $f(x)$, and the $Q$-convexity of the
function $\phi(x)$ follows from the case for $K \ge Q$. Hence, we assume that $K \ge Q$.

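Let $E = \{x \mid \phi(x) = f(x)\}$ and $O = \{x \mid \phi(x) < f(x)\}$. We show that for any
$x_0, x_1$ and $\lambda \in [0, 1]$ with $x_0 \le x_1$,

$$\phi(x_\lambda) \le (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \lambda K,$$

where $x_\lambda = (1 - \lambda)x_0 + \lambda x_1$. We consider four different cases.

Case 1: $x_0, x_1 \in E$. In this case,

$$\begin{aligned}
\phi(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \lambda)f(x_0) + \lambda f(x_1) + \lambda K \\
&= (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \lambda K,
\end{aligned}$$

where the second inequality follows from the $K$-convexity of the function $f(x)$.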

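Case 2: $x_0, x_1 \in O$. In this case, let $\phi(x_i) = Q + f(y_i)$ for $i = 0, 1$ with $y_i \ge x_i$
and let $y_\lambda = (1 - \lambda)y_0 + \lambda y_1$. It is clear that $y_0 \le y_1$ and $y_\lambda \ge x_\lambda$. Furthermore,

$$\begin{aligned}
\phi(x_\lambda) &\le Q + f(y_\lambda) \\
&\le (1 - \lambda)(Q + f(y_0)) + \lambda(Q + f(y_1)) + \lambda K \\
&= (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \lambda K,
\end{aligned}$$

where the second inequality follows from the $K$-convexity of the function $f(x)$.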

Case 3: $x_0 \in E$, $x_1 \in O$. Let $\phi(x_1) = Q + f(y_1)$ with $y_1 \ge x_1$. Let
$x_\lambda = (1 - \mu)x_0 + \mu y_1$ with $\mu \le \lambda$. Then

$$\begin{aligned}
\phi(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \mu)f(x_0) + \mu f(y_1) + \mu K \\
&= (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \mu K + (\lambda - \mu)(f(x_0) - f(y_1)) - \lambda Q \\
&\le (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \mu K - \lambda Q + (\lambda - \mu)Q \\
&\le (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \lambda K,
\end{aligned}$$

where the second inequality follows from the $K$-convexity of the function $f(x)$ and
the third inequality holds since $f(x_0) \le Q + f(y_1)$.

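Case 4: $x_0 \in O$, $x_1 \in E$. Let $\phi(x_0) = Q + f(y_0)$ for $y_0 \ge x_0$. We distinguish
between two different cases.

Subcase 1: $x_\lambda \le y_0$. In this case,

$$\begin{aligned}
\phi(x_\lambda) &\le Q + f(y_0) \\
&= (1 - \lambda)(Q + f(y_0)) + \lambda f(x_1) + \lambda(Q + f(y_0) - f(x_1)) \\
&\le (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \lambda Q,
\end{aligned}$$

where the last inequality holds since $f(y_0) \le f(x_1)$.

Subcase 2: $x_\lambda > y_0$. Let $x_\lambda = (1 - \mu)y_0 + \mu x_1$ with $\mu \le \lambda$. Then

$$\begin{aligned}
\phi(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \mu)f(y_0) + \mu f(x_1) + \mu K \\
&= (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \mu K + (\lambda - \mu)(f(y_0) - f(x_1)) - (1 - \lambda)Q \\
&\le (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \lambda K,
\end{aligned}$$

where the second inequality follows from the $K$-convexity of the function $f(x)$ and
the last inequality holds since $f(y_0) \le f(x_1)$. ∎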

9.3.3 Main Results

It remains to show that an $(s_t, S_t)$ policy is optimal for every $t$, $t = 1, 2, \ldots, T$.

For this purpose, it is sufficient to prove that the function $G_t(y)$ is $K$-convex, and
$G_t(y) \to \infty$ as $|y| \to \infty$, for each period $t$, $t = 1, 2, \ldots, T$.

Theorem 9.3.4 (a) For any $t = 1, 2, \ldots, T$, $G_t(y)$ and $z_t(y)$ are continuous
and $\lim_{|y| \to \infty} G_t(y) = \infty$.

(b) For any $t = 1, 2, \ldots, T$, $G_t(y)$ and $z_t(y)$ are $K$-convex.

(c) For any $t = 1, 2, \ldots, T$, there exist two parameters $s_t$ and $S_t$ such that it is
optimal to make an order to raise the inventory level to $S_t$ when the initial
inventory level is no more than $s_t$ and to order nothing otherwise.

Proof. We prove by induction. For $t = T$, $G_T(y) = G(y)$ for all $y$. Hence, $G_T(y)$ is
continuous and $K$-convex (in fact, convex), and $\lim_{|y| \to \infty} G_T(y) = \infty$.

Assume that $G_t(y)$ is continuous and $K$-convex, and $\lim_{|y| \to \infty} G_t(y) = \infty$. Then
Lemma 9.3.2 part (d) allows us to show that there exist two parameters $s_t$ and $S_t$
with $s_t \le S_t$ such that $S_t$ minimizes $G_t(y)$ and $G_t(s_t) = G_t(S_t) + K$. Furthermore,

$$z_t(y) = \begin{cases} K + G_t(S_t), & \text{if } y \le s_t, \\ G_t(y), & \text{otherwise.} \end{cases}$$

Since $G_t(s_t) = G_t(S_t) + K$, $z_t(y)$ is continuous and Proposition 9.3.3 implies that
$z_t(y)$ is $K$-convex.

Finally, $G_{t-1}(y) = G(y) + E[z_t(y - D)]$. Therefore, $G_{t-1}(y)$ is continuous, and
from Lemma 9.3.2 part (c), $G_{t-1}(y)$ is $K$-convex. Moreover, $\lim_{|y| \to \infty} G_{t-1}(y) = \infty$,
since $z_t(y) \ge G_t(S_t)$ for any $y$. ∎

So far we assume that demands are identically distributed and the cost
parameters, $c$, $h^+$, and $h^-$, are time-independent. These assumptions can be easily relaxed
and an $(s, S)$ policy is still optimal. Indeed, in Chap. 10, we analyze the
finite-horizon inventory and pricing model, including the inventory model analyzed in
this section as a special case, under more general assumptions.



9.4 Quasiconvex Loss Functions 

The above proof of the optimality of $(s_t, S_t)$ policies relies on the fact that the
one-period loss function $G(y)$ is convex. In many practical situations, this assumption
is not appropriate. For instance, consider the previous model, but assume that
whenever a shortage occurs, an emergency shipment is requested. Suppose further
that this emergency shipment incurs a fixed cost plus a linear cost proportional
to the shortage level. It can be easily shown that the new loss function $G(y)$ is, in
general, not convex.

To overcome this difficulty, Veinott (1966) offers a different yet elegant proof for
the optimality of $(s_t, S_t)$ policies under the assumption that $G(y)$ is quasiconvex.
Here we provide a slightly simplified proof suggested by Chen (1996) for the model
considered here. Recall the concept of quasiconvexity. A function $f$ is quasiconvex
on a convex set $X$ if for any $x, y \in X$ and $0 \le \lambda \le 1$,

$$f(\lambda x + (1 - \lambda)y) \le \max\{f(x), f(y)\}.$$

As we already pointed out in Chap. 2, a convex function is also quasiconvex, and
$f$ is quasiconvex if $-f(x)$ is unimodal.
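The emergency-shipment example can be made concrete with a small Python sketch. For simplicity we assume, purely for illustration, that demand is deterministic and equal to 2 units, so that the loss function can be written down directly; the helper names and all numbers are hypothetical. On an integer grid, the sketch checks convexity through second differences and quasiconvexity through the definition above.

```python
def is_convex(values, tol=1e-12):
    """Discrete convexity: all second differences are nonnegative."""
    return all(values[i - 1] - 2 * values[i] + values[i + 1] >= -tol
               for i in range(1, len(values) - 1))


def is_quasiconvex(values, tol=1e-12):
    """Grid version of the definition: f(x_j) <= max(f(x_i), f(x_k)) for i < j < k."""
    n = len(values)
    return all(values[j] <= max(values[i], values[k]) + tol
               for i in range(n) for k in range(i + 2, n) for j in range(i + 1, k))


# Hypothetical data: demand is exactly 2 units; holding cost 1 per unit left over;
# an emergency shipment costs a fixed 4 plus 1 per unit short.
DEMAND, H_PLUS, FIXED, PER_UNIT = 2, 1.0, 4.0, 1.0


def G(y):
    """Loss for post-order inventory y under the emergency-shipment assumption."""
    if y < DEMAND:
        return FIXED + PER_UNIT * (DEMAND - y)
    return H_PLUS * (y - DEMAND)


values = [G(y) for y in range(0, 6)]
print(values)                  # [6.0, 5.0, 0.0, 1.0, 2.0, 3.0]
print(is_convex(values))       # False
print(is_quasiconvex(values))  # True
```

The fixed charge creates a jump at $y = 2$, so the second difference at $y = 1$ is negative and convexity fails, yet the function still decreases up to its minimizer and increases afterward.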

Consider the dynamic program (9.3)-(9.4). In the analysis below, we use the
following assumptions on $G(y)$.

(i) $G(y)$ is continuous and quasiconvex.

(ii) $G(y) > \inf_x G(x) + K$ as $|y| \to \infty$.

Other assumptions on ordering costs and demands are the same as in the previous
section.

If (i) and (ii) hold, there is a number $y^*$ that minimizes $G(y)$. In addition, there
are two numbers $s$ $(\le y^*)$ and $S$ $(\ge y^*)$ such that

$$G(S) = G(y^*) + K, \tag{9.7}$$

$$G(s) = G(y^*) + K. \tag{9.8}$$

It is also worth mentioning that $G(y)$ is nonincreasing in $y$ on $(-\infty, y^*]$ and
nondecreasing in $y$ on $(y^*, \infty)$.

To prove the optimality of an $(s_t, S_t)$ policy for all $t$, we need the next two
lemmas.

Lemma 9.4.1 For $t = 1, \ldots, T$, and $y \le y'$,

$$z_t(y) \le z_t(y') + K \quad \text{and} \tag{9.9}$$

$$G_t(y') - G_t(y) \ge G(y') - G(y) - K. \tag{9.10}$$

Proof. It follows that

$$\begin{aligned}
z_t(y) &= \min\{G_t(y),\ K + \min_{x \ge y} G_t(x)\} \\
&\le K + \min_{x \ge y} G_t(x) \\
&\le K + \min_{x \ge y'} G_t(x) \\
&\le K + z_t(y').
\end{aligned}$$

We also provide an alternative proof here. The result obviously holds for $y' = y$.
Now assume that $y' > y$. Suppose that at the beginning of the period, the inventory
level prior to any ordering is $y$. Consider the following strategy: We first raise the
inventory level up to $y'$ and then act optimally as if we started with the inventory
level $y'$ (prior to any ordering). Such a strategy incurs cost equal to $K + z_t(y')$.
Because this strategy is not necessarily optimal, it follows that

$$z_t(y) \le K + z_t(y'),$$

which also proves (9.9).

Inequalities in (9.9) imply that

$$\begin{aligned}
G_t(y') - G_t(y) &= G(y') - G(y) + E[z_{t+1}(y' - D)] - E[z_{t+1}(y - D)] \\
&\ge G(y') - G(y) - K,
\end{aligned}$$

which completes the proof. ∎

A function $f: \mathbb{R} \to \mathbb{R}$ is called non-$K$-decreasing if for any $x$ and $x'$ with $x \le x'$,
$f(x) \le f(x') + K$. The above lemma thus implies that $z_t(y)$ and $G_t(y) - G(y)$ are
non-$K$-decreasing. The following lemma, on the other hand, illustrates that $z_t(y)$
and $G_t(y)$ are nonincreasing for $y \le y^*$.

Lemma 9.4.2 For $t = 1, \ldots, T$ and $y \le y' \le y^*$,

$$G_t(y') - G_t(y) \le G(y') - G(y) \le 0 \quad \text{and} \tag{9.11}$$

$$z_t(y') \le z_t(y). \tag{9.12}$$

Proof. The proof is by induction. Note that $G(y)$ is decreasing in $y$ for $y \le y^*$.
For $t = T$, $G_T(y') - G_T(y) = G(y') - G(y) \le 0$, which implies that
$\min_{x \ge y'} G_T(x) = \min_{x \ge y} G_T(x)$. Then

$$\begin{aligned}
z_T(y') &= \min\{G_T(y'),\ K + \min_{x \ge y'} G_T(x)\} \\
&\le \min\{G_T(y),\ K + \min_{x \ge y'} G_T(x)\} \\
&= \min\{G_T(y),\ K + \min_{x \ge y} G_T(x)\} = z_T(y).
\end{aligned}$$

Assume that for $t + 1 \le T$ and $y \le y' \le y^*$,

$$G_{t+1}(y') - G_{t+1}(y) \le G(y') - G(y) \le 0 \quad \text{and} \quad z_{t+1}(y') \le z_{t+1}(y).$$

Now it follows immediately that

$$\begin{aligned}
G_t(y') - G_t(y) &= G(y') - G(y) + E[z_{t+1}(y' - D)] - E[z_{t+1}(y - D)] \\
&= G(y') - G(y) + E[z_{t+1}(y' - D) - z_{t+1}(y - D)] \\
&\le G(y') - G(y) \le 0
\end{aligned} \tag{9.13}$$

and

$$\begin{aligned}
z_t(y') &= \min\{G_t(y'),\ K + \min_{x \ge y'} G_t(x)\} \\
&\le \min\{G_t(y),\ K + \min_{x \ge y} G_t(x)\} = z_t(y).
\end{aligned}$$

This completes the proof. ∎

We are now ready to show the optimality result. 

Theorem 9.4.3 (Veinott 1966) If (i) and (ii) hold, an $(s_t, S_t)$ policy is optimal
for the model (9.3)-(9.4). Moreover, $s \le s_t \le y^*$ and $y^* \le S_t \le S$.



Proof. The proof proceeds in several steps. We start with the assumption that
$G_t(y)$ is continuous in $y$. This assumption will be confirmed at the end.

(1) $S_t$ is a global minimizer of $G_t(y)$. For this purpose, we first show that $G_t(y)$
is decreasing for $y \le y^*$, which follows directly from (9.11). Because $G_t(y)$ is
continuous, there exists a number $S_t$ that minimizes $G_t(y)$ over $[y^*, S]$. Now it is
clear that $S_t$ minimizes $G_t(y)$ on $(-\infty, S]$. By the definition of $S$ and Lemma 9.4.1,
it follows that for $y > S$ $(\ge y^*)$,

$$\begin{aligned}
G_t(y) - G_t(y^*) &\ge G(y) - G(y^*) - K \\
&\ge G(S) - G(y^*) - K = 0,
\end{aligned}$$

where $G(y) \ge G(S)$ due to the quasiconvexity of $G(y)$. Hence, $S_t$ is indeed a global
minimizer of $G_t(y)$ and $y^* \le S_t \le S$.

(2) There exists a number $s_t$ such that

$$G_t(S_t) + K = G_t(s_t) \quad \text{and} \quad s \le s_t \le y^*.$$

The definitions of $S_t$, $s$, and $y^*$ imply that

$$\begin{aligned}
G_t(S_t) + K - G_t(s) &\le G_t(y^*) + K - G_t(s) \\
&\le G(y^*) + K - G(s) = 0,
\end{aligned}$$

where the first inequality follows from the definition that $S_t$ is the minimizer of
$G_t(y)$, while the second inequality holds due to Lemma 9.4.2. From the definition
of $y^*$ and Lemma 9.4.1, we see

$$G_t(S_t) + K - G_t(y^*) \ge G(S_t) - G(y^*) - K + K \ge 0.$$

Together with the continuity assumption of $G_t(y)$ and the fact that $G_t(y)$ is
decreasing on $(-\infty, y^*]$, the above two inequalities imply that there exists a number
$s_t$ such that

$$G_t(S_t) + K = G_t(s_t) \quad \text{and} \quad s \le s_t \le y^*.$$

(3) For $y^* \le y \le y'$,

$$[K + G_t(y')] - G_t(y) \ge 0.$$

This follows directly from Lemma 9.4.1 and the fact that $G(y') \ge G(y)$:

$$G_t(y') - G_t(y) \ge G(y') - G(y) - K \ge -K.$$

Note that this observation implies that placing an order does not reduce the
expected cost when $y \ge y^*$.

(4) We conclude, therefore, that an $(s_t, S_t)$ policy is optimal.
(5) It remains to prove that $G_t(y)$ is continuous in $y$.

Again, we proceed by induction. It is true for $t = T$ because $G_T(y) = G(y)$ [by
assumption (i)]. Suppose now that $G_{t+1}(y)$ is continuous for $t < T$. From (4),

$$z_{t+1}(y) = \begin{cases} K + G_{t+1}(S_{t+1}), & \text{if } y \le s_{t+1}, \\ G_{t+1}(y), & \text{if } y > s_{t+1}. \end{cases}$$

Hence, $z_{t+1}$ is continuous. Finally, the continuity of $E[z_{t+1}(y - D)]$ follows from the
continuity of the function $z_{t+1}$ and the uniform continuity theorem, which basically
says that a continuous function is uniformly continuous over a compact set. ∎
The above proof for the optimality of $(s_t, S_t)$ policies is based on the assumption
that demands are independent and identically distributed. If demands are not
independent and identically distributed, Lemma 9.4.2 will generally fail to hold
for the following reason. In the proof of Lemma 9.4.2, we require that
$z_{t+1}(y' - D) - z_{t+1}(y - D) \le 0$ for all $D$ in (9.13), which holds only if
$y - D \le y' - D \le y^*$. When demands are not independent and identically
distributed, the minimizer of $G(y)$ may vary from period to period, and the
requirement that $z_{t+1}(y' - D) - z_{t+1}(y - D) \le 0$ may not be met. In the proof based on
$K$-convexity, however, no requirement is imposed upon demands. Thus, while the
result in this section is more general than the results of Sect. 9.3 when demands are
independent and identically distributed, it is not a generalization of those results.

9.5 Infinite-Horizon Models

In this section, we consider a discrete-time infinite-horizon model in which an order
may be placed by the warehouse at the beginning of any period. To simplify the
analysis, we focus on discrete inventory levels and assume a discrete distribution
of the one-period demand $D$. Let $p_j = \Pr\{D = j\}$ for $j = 0, 1, 2, \ldots$. The objective
is to minimize the long-run expected cost per period. All other assumptions and
notations are identical to those in the previous section.

This problem has attracted considerable attention in the last three decades. The
intuition developed in the previous section (for the finite-horizon models) suggests,
and it is proved by Iglehart (1963b) and Veinott and Wagner (1965), that an $(s, S)$
policy is optimal for the infinite-horizon case. A simple proof is proposed by Zheng
(1991). Various algorithms have been suggested by Veinott and Wagner (1965),
Bell (1970), and Archibald and Silver (1978) as well as others; see, for instance,
Porteus (1990) or Zheng and Federgruen (1991). This section describes a simple
proof for the optimality of a stationary $(s, S)$ policy given by Zheng (1991) and
sketches an algorithm developed by Zheng and Federgruen (1991) for finding the
optimal $(s, S)$ policy. We follow those papers, as well as the insight provided in
Denardo (1996).

Let $c(s, S)$ be the long-run average cost associated with the $(s, S)$ policy. Given a
period and an initial inventory $y$, recall that the loss function $G(y)$ is the expected
holding and shortage cost minus revenue at the end of the period. In what follows,
the loss function $G(y)$ is assumed to be quasiconvex and $G(y) \to \infty$ as $|y| \to \infty$.

Let $M(j)$ be the expected number of periods that elapse until the next order is
placed when starting with $s + j$ units of inventory. That is, $M(j)$ is the expected
number of periods until the total demand is no less than $j$ units. It is obvious that
for all $j > 0$, we have

$$M(j) = \sum_{k=0}^{j} p_k[1 + M(j - k)] + \sum_{k=j+1}^{\infty} p_k = \sum_{k=0}^{\infty} p_k M(j - k) + 1, \tag{9.14}$$

with $M(j) = 0$ for $j \le 0$.

Let $T(s, y)$ be the expected total cost in all periods until placing the next order,
when we start with $y$ units of inventory.

Observe that since orders are received immediately, each time an order is placed,
the inventory level increases to $S$. Hence, replenishment times can be viewed as
regeneration points; see Ross (1970). The theory of regeneration processes tells us
that

$$c(s, S) = \frac{T(s, S)}{M(S - s)}. \tag{9.15}$$

That is, $c(s, S)$, the long-run average cost, is the ratio of the expected cost
between successive regeneration points and the expected time between successive
regeneration points.

To calculate $M(S - s)$, one need only solve the recursive equation (9.14). In
addition,

$$T(s, S) = K + H(s, S),$$

where $H(s, y)$ is the expected holding and shortage cost until placing the next
order, when starting with $y$ units of inventory. How can we calculate the quantity
$H(s, S)$? For this purpose, observe that $M(j + 1) \ge M(j)$ and let

$$m(j) = M(j + 1) - M(j),$$

for $j = 0, 1, 2, \ldots$. To interpret $m(j)$, observe that for any $j$, $j < S - s$, $M(j + 1)$
is the expected time until demand exceeds $j$ units. Thus, the definition of $M(j)$
implies that $m(j)$ is the expected number of periods, prior to placing the next
order, for which the inventory level is exactly $S - j$. Hence,

$$H(s, S) = \sum_{j=0}^{S-s-1} m(j)G(S - j). \tag{9.16}$$

An alternative way of computing $H(s, y)$ is as follows:

$$H(s, y) = G(y) + \sum_{j=0}^{\infty} p_j H(s, y - j), \quad \text{for } y > s, \tag{9.17}$$

and $H(s, y) = 0$ for $y \le s$. To summarize, for an $(s, S)$ policy, we have

$$c(s, S) = \frac{K + \sum_{j=0}^{S-s-1} m(j)G(S - j)}{M(S - s)}.$$
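The quantities $M(j)$, $m(j)$, and $c(s, S)$ are straightforward to compute. The following Python sketch solves the recursion (9.14) for $M(j)$ by moving the $k = 0$ term to the left-hand side, which assumes $p_0 < 1$, and then evaluates the average cost formula above. The function names and the test data are illustrative assumptions.

```python
def renewal_M(pmf, n):
    """Return [M(0), ..., M(n)] from (9.14), with M(0) = 0.

    pmf[k] = Pr{D = k}; requires pmf[0] < 1 so that demand eventually occurs.
    """
    def p(k):
        return pmf[k] if k < len(pmf) else 0.0

    M = [0.0] * (n + 1)
    for j in range(1, n + 1):
        # (1 - p_0) M(j) = 1 + sum_{k=1}^{j-1} p_k M(j - k), since M(0) = 0.
        M[j] = (1.0 + sum(p(k) * M[j - k] for k in range(1, j))) / (1.0 - p(0))
    return M


def average_cost(s, S, K, G, pmf):
    """Long-run average cost c(s, S) = (K + sum_j m(j) G(S - j)) / M(S - s)."""
    M = renewal_M(pmf, S - s)
    m = [M[j + 1] - M[j] for j in range(S - s)]    # m(j) = M(j + 1) - M(j)
    holding = sum(m[j] * G(S - j) for j in range(S - s))
    return (K + holding) / M[S - s]


# Hypothetical check: demand is 0 or 1 with equal probability.
print(renewal_M([0.5, 0.5], 3))   # [0.0, 2.0, 4.0, 6.0]
```

With demand equal to 0 or 1 with probability 1/2 each, every unit of inventory lasts two periods on average, which is why $M(j) = 2j$ in this example.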



Let $y^*$ be any minimizer of the loss function $G$. Zheng and Federgruen's
algorithm as well as Zheng's proof are essentially based on the following results, which
characterize the properties of the optimal $(s, S)$ policies.

Lemma 9.5.1 For any given $(s, S)$ policy, there exists another $(s', S')$ policy with
$s' < y^* \le S'$ such that $c(s', S') \le c(s, S)$.

Proof. Observe that $G(y)$ is a quasiconvex function of $y$ and therefore $G(y)$ is
nonincreasing for $y \le y^*$ and nondecreasing for $y \ge y^*$. Consider now $s \ge y^*$.
Equation (9.16) together with the quasiconvexity of $G(y)$ implies that
$H(s - 1, S - 1) \le H(s, S)$. Hence, $c(s, S) \ge c(s - 1, S - 1)$. Suppose now that $y^* > S$.
A similar argument shows that $H(s + 1, S + 1) \le H(s, S)$, and hence
$c(s, S) \ge c(s + 1, S + 1)$, which completes the proof. ∎

The following result is useful for our analysis.

Lemma 9.5.2 Assume $s^0 < y^* \le S$. For a given $p$, we have that

(a) If $p \le G(s^0)$, then for any $s < s^0$, there exists $0 \le \beta \le 1$ such that

$$c(s, S) \ge \beta c(s^0, S) + (1 - \beta)p.$$

(b) If $p \ge G(s^0 + 1)$, then for any $s^0 < s < y^*$, there exists $0 \le \beta \le 1$ such that

$$c(s^0, S) \le \beta c(s, S) + (1 - \beta)p.$$

Proof. For part (a), let $\beta = M(S - s^0)/M(S - s)$ and observe that $0 \le \beta \le 1$.
From the definition of $c(s, S)$, we have

$$\begin{aligned}
c(s, S) &= \frac{K + \sum_{j=0}^{S-s^0-1} m(j)G(S - j) + \sum_{j=S-s^0}^{S-s-1} m(j)G(S - j)}{M(S - s)} \\
&= \frac{c(s^0, S)M(S - s^0) + \sum_{j=S-s^0}^{S-s-1} m(j)G(S - j)}{M(S - s)} \\
&\ge \frac{c(s^0, S)M(S - s^0) + \sum_{j=S-s^0}^{S-s-1} m(j)\,p}{M(S - s)} \\
&= \beta c(s^0, S) + (1 - \beta)p,
\end{aligned}$$

where the inequality holds since the loss function $G$ is quasiconvex, so that
$G(S - j) \ge G(s^0) \ge p$ whenever $S - j \le s^0 < y^*$, and the last equality uses
$\sum_{j=S-s^0}^{S-s-1} m(j) = M(S - s) - M(S - s^0)$. Finally, the
proof of part (b) follows from a similar argument and is left as an exercise. ∎
We are ready to provide a useful characterization of the optimal reorder levels
for a given order-up-to level.

Lemma 9.5.3 For a given order-up-to level $S$, a reorder level $s^0 < y^*$ is optimal
[i.e., $c(s^0, S) = \min_{s < S} c(s, S)$] if

$$G(s^0) \ge c(s^0, S) \ge G(s^0 + 1). \tag{9.18}$$

Similarly, for any order-up-to level $S$, there exists an optimal reorder level $s^0$ such
that $s^0 < y^*$ and (9.18) holds.



Proof. The optimality of $s^0$ for $s^0$ satisfying (9.18) follows from Lemma 9.5.2 upon
letting $p = c(s^0, S)$.

We now prove the second part of the result. For any $s < y^*$, there exists an
$s^0 < y^*$ such that $G(s^0) \ge c(s, S) \ge G(s^0 + 1)$ since $G(y) \to \infty$ for $|y| \to \infty$ and
$c(s, S) \ge \min_x G(x)$. Upon letting $p = c(s, S)$, Lemma 9.5.2 implies that
$c(s^0, S) \le c(s, S)$. If $s^0$ satisfies (9.18), then we are done; otherwise, there exists
$s^1 < y^*$ such that $s^1 > s^0$ and $G(s^1) \ge c(s^0, S) \ge G(s^1 + 1)$. Again from
Lemma 9.5.2, we have $c(s^1, S) \le c(s^0, S)$. If $s^1$ satisfies (9.18), we are done;
otherwise, repeat this process. This process has to be finite since $y^*$ is an upper
bound, and thus we end up with a reorder point satisfying (9.18), which is optimal
from the first part of the result. ∎

An immediate byproduct of the lemma is an algorithm for finding an optimal
reorder point $s^0$ for any given $S$.

Corollary 9.5.4 For any value of $S$, $s^0 = \max\{y < y^* \mid c(y, S) \le G(y)\}$ is the
optimal reorder level associated with $S$.

Proof. Let

$$\alpha = \frac{M(S - s - 1)}{M(S - s)}$$

and observe that (9.15) and (9.16) imply that

$$c(s, S) = \alpha c(s + 1, S) + (1 - \alpha)G(s + 1). \tag{9.19}$$

The definition of $s^0$ implies that

$$G(s^0) \ge c(s^0, S) \quad \text{and} \quad G(s^0 + 1) < c(s^0 + 1, S).$$

In addition, using (9.19), we have $c(s^0, S) \ge G(s^0 + 1)$. Hence, (9.18) holds and
by Lemma 9.5.3, $s^0$ is an optimal reorder point associated with $S$. ∎
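Corollary 9.5.4 translates directly into a simple search: start just below $y^*$ and decrease the candidate reorder point until $c(y, S) \le G(y)$. The Python sketch below implements this search; it repeats a compact version of the average cost computation from (9.14)-(9.16) so that it is self-contained. The function names, the deterministic demand, and the cost parameters are assumptions chosen so that the result can be checked by hand.

```python
def average_cost(s, S, K, G, pmf):
    """c(s, S) from (9.14)-(9.16); pmf[k] = Pr{D = k}, with pmf[0] < 1."""
    n = S - s
    M = [0.0] * (n + 1)
    for j in range(1, n + 1):
        tail = sum(pmf[k] * M[j - k] for k in range(1, min(j, len(pmf))))
        M[j] = (1.0 + tail) / (1.0 - pmf[0])
    return (K + sum((M[j + 1] - M[j]) * G(S - j) for j in range(n))) / M[n]


def best_reorder_point(S, y_star, K, G, pmf):
    """Largest s < y* with c(s, S) <= G(s), as in Corollary 9.5.4 (needs S >= y*)."""
    s = y_star - 1
    while average_cost(s, S, K, G, pmf) > G(s):
        s -= 1                      # terminates because G(s) grows as s decreases
    return s


# Hypothetical data: one unit of demand per period, holding cost 1, backorder cost 3.
def G(y):
    return 1.0 * max(y - 1, 0) + 3.0 * max(1 - y, 0)


pmf = [0.0, 1.0]
s0 = best_reorder_point(S=3, y_star=1, K=5.0, G=G, pmf=pmf)
print(s0, average_cost(s0, 3, 5.0, G, pmf))   # s0 = 0, with cost 8/3
```

In this instance $c(0, 3) = (5 + 2 + 1 + 0)/3 = 8/3 \le G(0) = 3$, so the search stops immediately at $s^0 = 0$; one can also verify that $G(0) \ge c(0, 3) \ge G(1)$, which is condition (9.18).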

Lemma 9.5.5 For two order-up-to levels $S^0, S \ge y^*$, let $s^0$ and $s$ with $s^0, s < y^*$
be the corresponding optimal reorder points, respectively. Moreover, assume that

$$G(s^0) \ge c(s^0, S^0) \ge G(s^0 + 1).$$

The $(s, S)$ policy improves on (has smaller cost than) $(s^0, S^0)$ if and only if

$$c(s^0, S) < c(s^0, S^0).$$

Proof. We only need to show that if $c(s, S) < c(s^0, S^0)$, then $c(s^0, S) < c(s^0, S^0)$.
By contradiction, assume $c(s^0, S) \ge c(s^0, S^0)$. Upon letting
$p = c(s^0, S) \ge c(s^0, S^0) \ge G(s^0 + 1)$, we have from Lemma 9.5.2 part (b) that

$$c(s, S) \ge c(s^0, S) \ge c(s^0, S^0),$$

which is a contradiction. ∎

Finally, we provide a characterization of the optimal order-up-to level for a given 
reorder level s. For this purpose, define 

$$\phi(i, s, S) = \begin{cases} 0, & \text{if } i \le s, \\ G(i) - c(s, S) + \sum_{j=0}^{\infty} p_j\, \phi(i - j, s, S), & \text{otherwise.} \end{cases} \tag{9.20}$$

From the recursive forms (9.14) and (9.17), we have that for i > s, 

$$\phi(i, s, S) = H(s, i) - c(s, S)\, M(i - s)$$

and φ(S, s, S) = −K. 

Lemma 9.5.6 For a given reorder level s, if an order-up-to level S° is optimal 
(c(s, S°) = inf_{S>s} c(s, S)), then c(s, S°) ≥ G(S°). 

Proof. Assume that c(s, S) < G(S) for some S > s. Then there exists an inventory 
level i with s < i < S such that −K = φ(S, s, S) > φ(i, s, S). This implies that 

$$c(s, S) > \frac{K + H(s, i)}{M(i - s)} = c(s, i).$$

Thus, S cannot be optimal. ∎

From the above proof, we can also see that if an order-up-to level S is optimal 
for a given reorder level s, then φ(i, s, S) ≥ −K = φ(S, s, S) for any i. The 
following characterization of the properties of the best (s, S) policy is an 
immediate consequence of Lemma 9.5.1, Lemma 9.5.3, Lemma 9.5.6, and the above 
observation. 

Lemma 9.5.7 There exists an (s*, S*) policy such that the following hold: 

(a) c* = c(s*, S*) = inf_{s<S} c(s, S); 

(b) s* < y* ≤ S*; 

(c) G(s*) ≥ c* ≥ G(s* + 1); 

(d) G(S*) ≤ c*; 

(e) φ(i) ≥ −K = φ(S*) for any i, where φ(i) = φ(i, s*, S*). 

Furthermore, these results suggest the following simple algorithm. Start with 
S° = y* and find the best reorder point s° by applying Corollary 9.5.4. Now increase S 
by increments of 1, each time comparing c(s°, S°) to c(s°, S). If c(s°, S) < c(s°, S°), 
set S° = S and find the corresponding reorder point. Continue until you’ve identified 
(s°, S°) such that no S > S° has both c(s°, S) < c(s°, S°) and G(S) ≤ c(s°, S°). 
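
The search just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation behind the text: the demand distribution P, the cost parameters, and all function names are hypothetical, and c(s, S) is evaluated through the renewal-type form c(s, S) = [K + Σ_{j=0}^{S−s−1} m(j) G(S − j)] / M(S − s), which we assume agrees with the recursions (9.14)-(9.17). The loop over S stops as soon as G(S) exceeds the incumbent cost, since such an S cannot be optimal by Lemma 9.5.6.

```python
from functools import lru_cache

# Hypothetical data for illustration only (not from the text).
P = [0.1, 0.3, 0.4, 0.2]             # P[d] = Pr(one-period demand = d)
K, H_PLUS, H_MINUS = 10.0, 1.0, 5.0  # setup, holding, and backorder costs


def G(y):
    """One-period expected holding and backorder cost at inventory level y."""
    return sum(p * (H_PLUS * max(y - d, 0) + H_MINUS * max(d - y, 0))
               for d, p in enumerate(P))


@lru_cache(maxsize=None)
def m(j):
    """Expected number of periods spent at level S - j during one cycle."""
    if j == 0:
        return 1.0 / (1.0 - P[0])
    top = min(j, len(P) - 1)
    return sum(P[d] * m(j - d) for d in range(1, top + 1)) / (1.0 - P[0])


def M(j):
    """Expected cycle length when the gap S - s equals j."""
    return sum(m(i) for i in range(j))


def c(s, S):
    """Long-run average cost of the (s, S) policy (assumed renewal form)."""
    return (K + sum(m(j) * G(S - j) for j in range(S - s))) / M(S - s)


def best_reorder_point(S, y_star):
    """Corollary 9.5.4: the largest y < y* with c(y, S) <= G(y)."""
    y = y_star - 1
    while c(y, S) > G(y):
        y -= 1
    return y


def find_policy(lo=-50, hi=50):
    """Increase S from y*, re-optimizing s whenever the cost improves."""
    y_star = min(range(lo, hi + 1), key=G)  # a minimizer of G on [lo, hi]
    S0 = y_star
    s0 = best_reorder_point(S0, y_star)
    c0 = c(s0, S0)
    S = S0 + 1
    while G(S) <= c0:            # beyond this point S cannot be optimal
        if c(s0, S) < c0:        # Lemma 9.5.5: S improves on S0
            S0 = S
            s0 = best_reorder_point(S0, y_star)
            c0 = c(s0, S0)
        S += 1
    return s0, S0, c0


if __name__ == "__main__":
    print(find_policy())
```

Calling find_policy() returns a triple (s, S, cost). Re-running the corollary from y* − 1 after every improvement is the simplest choice; a faster variant would resume the search for s from the current reorder point.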

So far we’ve characterized the properties of the best (s, S) policy, the (s*, S*) 
policy, and we’ve described how to find such a policy. We are now ready to prove 
that this stationary (s*, S*) policy is optimal for the infinite-horizon model. Of 
course, as is common for the general infinite-horizon dynamic program, one might 
attempt to prove that there exists a function h such that the following optimality 
equation holds: 

$$h(x) + c^* = \min_{y \ge x} \Big\{ K\delta(y - x) + G(y) + \sum_{j=0}^{\infty} p_j\, h(y - j) \Big\}. \tag{9.21}$$

In fact, one can prove that the function φ defined in Lemma 9.5.7 satisfies the above 
optimality equation (9.21). Unfortunately, since this function is unbounded, there 
is no result in dynamic programming that allows us to claim the optimality of the 
stationary (s*, S*) policy without further justification. Hence, we follow a different 
approach. In particular, we focus on a relaxed model where negative orders are 
allowed and whenever a negative order is placed, a fixed cost K is charged. 

We construct a bounded function h satisfying the optimality equation for the 
relaxed model 

$$h(x) + c^* = \min_{y} \Big\{ K\delta(|y - x|) + G(y) + \sum_{j=0}^{\infty} p_j\, h(y - j) \Big\}. \tag{9.22}$$

The construction of function h is as follows: 

$$h(i) = \begin{cases} 0, & \text{if } i \le s^*, \\ \Phi(i), & \text{for } s^* < i \le S^*, \\ \min\{0, \Phi(i)\}, & \text{otherwise,} \end{cases} \tag{9.23}$$

where $\Phi(i) = G(i) - c^* + \sum_{j=0}^{\infty} p_j\, h(i - j)$. 

We now prove that −K ≤ h(i) ≤ 0 for any i. First notice that Φ(i) = φ(i) 
for i ≤ S*; hence, from Lemma 9.5.7 part (e), we have that h(i) ≥ −K for 
i ≤ S*. Moreover, using Lemma 9.5.7 part (c), we can show that h(i) ≤ 0 for 
any s* < i ≤ S* and consequently, h(i) ≤ 0 for any i. Thus, it suffices to prove 
Φ(i) ≥ −K for i > S*. Assume to the contrary that there exists an i' such 
that Φ(i') < −K, and without loss of generality let i' be the smallest one. Then 
h(i) ≥ −K for any i < i'. In addition, there must exist an i'' such that S* < i'' < i' 
and Φ(i'') > 0; otherwise, for any i with s* < i < i', Φ(i) ≤ 0, and therefore 
h(i) = Φ(i) = φ(i) ≥ −K from Lemma 9.5.7 part (e). This implies that G(i'') > c*. 
However, since G is quasiconvex and i'' > S* ≥ y*, we can prove by induction that 
h(i) ≥ −K for any i. This is a contradiction since h(i') = Φ(i') < −K. Hence, 
−K ≤ h(i) ≤ 0 for any i. 

In summary, −K ≤ h(i) ≤ 0 for any i, h(S*) = −K, and Φ(i) ≥ −K for any 
i > s*. It is straightforward to verify that h(i) satisfies the optimality equation of 
the relaxed model (9.22), and a modified (s*, S*) policy attains the minimization 
in the optimality equation. In the modified policy, place an order to raise the 
inventory level to S* whenever the initial inventory level is no more than s*; do 
not place any order when the initial inventory lies between s* + 1 and S*; for an 
inventory level above S*, place a negative order to reduce the inventory level to 
S* or do nothing, depending on which choice is more cost-effective. 

We claim that the modified (s*, S*) policy is optimal for the relaxed model and 
its associated long-run average cost c* is optimal. Indeed, this claim follows from 
well-known results for infinite-horizon dynamic programming under an average 
cost criterion since, as we just proved, the function h is bounded; for details, one 
may refer to any standard dynamic programming textbook, for instance, Theorem 
2.1, p. 93, in Ross (1983). Also observe that the modified (s*, S*) policy is 
different from the (s*, S*) policy in at most one period: When the initial inventory 
level is too high, we may place a negative order to reduce the inventory level to 
S*, and after that the inventory level will never exceed S*. Because the outcome 
of a finite number of periods will not affect the long-run average cost, it is safe to 
claim that the stationary (s*, S*) policy is optimal for the relaxed model and its 
associated cost c* is the optimal average cost. Finally, notice that the stationary 
(s*, S*) policy is feasible for the original model, and the optimal average cost of 
the original model is no less than the optimal average cost of the relaxed model. 
Thus, this stationary (s*, S*) policy is optimal for the original infinite-horizon 
model and its associated cost c* is the optimal long-run average cost. 



9.6 Models with Positive Lead Times 

So far we have assumed zero lead times. Under this assumption, if demand that 
cannot be filled right away is lost instead of being backlogged, the analysis in the 
previous sections can be modified and our main results are still valid (you are asked 
to show that the analysis in Sect. 9.3 can be carried over to lost sales models in 
the exercise). However, if a fixed delivery lead time has to be incorporated, there 
is a significant difference between lost sales models and backlogging models. 

For the case with backlogging, the models discussed in the previous sections but 
allowing for positive lead times can be transformed into corresponding ones with 
zero lead time using a fairly simple cost accounting scheme, proposed by Scarf 
(1960) for the finite-horizon models. In this scheme, the cost allocated to period 
t is the ordering cost plus the expected inventory holding and backorder cost of 
period t + l instead of period t, where l denotes the lead time. The rationale behind 
this is that the inventory cost incurred by the ordering decision at period t will 
only take effect after the order arrives l periods later. 

To calculate the inventory holding and backorder cost of period t + l, simply 
observe that it only depends on the inventory position at the warehouse, defined 
as the inventory at that warehouse plus inventory in transit to the warehouse, at 
period t. Indeed, the on-hand inventory level at period t + l is the difference of the 
inventory position at period t immediately after placing the order, referred to as 
the inventory order-up-to position, and the cumulative demand from period t to 
period t + l − 1. Thus, given the inventory order-up-to position y at period t, we 
can write the expected inventory holding and backorder cost of period t + l as 

$$\tilde G(y) = h^+ \int_D \max(y - D, 0)\, dF(D) + h^- \int_D \max(D - y, 0)\, dF(D), \tag{9.24}$$

where F is the cdf of the total demand during the lead time plus one period. 

A backlogging model with a positive lead time l can then be transformed into 
a new one with zero lead time in which G̃(y) is treated as the one-period loss 
function. With this understanding, the dynamic program (9.3)-(9.4) in Sect. 9.3 
can be modified by replacing the loss function G(·) with G̃(·) so that for t = 
1, 2, ..., T − l, 

$$G_t(y) = \tilde G(y) + \int_D z_{t+1}(y - D)\, dF(D)$$

and 

$$z_t(x) = \min_{y \ge x} \{ K\delta(y - x) + G_t(y) \}.$$

It is not hard to show that the optimal inventory order-up-to position can be 
determined by solving the above dynamic program, and all the analysis and structural 
properties in Sects. 9.3 and 9.4 can be carried over to this new dynamic program. 
For the infinite-horizon counterpart, it suffices to replace the loss function G(·) by 
G̃(·) without changing the analysis in Sect. 9.5. 
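
As a small numerical illustration of the transformed loss function in (9.24), the sketch below assumes integer-valued, i.i.d. one-period demand given by a probability mass function, so that the distribution of total demand over l + 1 periods is an (l + 1)-fold convolution. The function names and the example data are our own and are not taken from the text.

```python
def convolve(p, q):
    """pmf of the sum of two independent nonnegative integer demands."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def lead_time_demand_pmf(pmf, lead_time):
    """pmf of total demand over lead_time + 1 periods (i.i.d. demand)."""
    total = [1.0]                      # pmf of an empty sum: zero w.p. 1
    for _ in range(lead_time + 1):
        total = convolve(total, pmf)
    return total


def loss(y, pmf, h_plus, h_minus):
    """Expected holding plus backorder cost for order-up-to position y."""
    return sum(p * (h_plus * max(y - d, 0) + h_minus * max(d - y, 0))
               for d, p in enumerate(pmf))


if __name__ == "__main__":
    one_period = [0.5, 0.5]                       # demand is 0 or 1
    over_lead_time = lead_time_demand_pmf(one_period, lead_time=1)
    print(over_lead_time)                         # [0.25, 0.5, 0.25]
    print(loss(1, over_lead_time, 1.0, 4.0))      # 1.25
```

With lead_time = 0 the function loss reduces to the ordinary one-period loss function, which is the sense in which the positive-lead-time model collapses to a zero-lead-time one.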

Unfortunately, this observation no longer works for lost sales models. In fact, 
though the inventory cost incurred by the ordering decision at period t is still 
a function of the on-hand inventory level at period t + l, its dependence on the 
inventory at the warehouse and in transit to the warehouse at period t is more 
complicated, and the inventory position at period t alone is not sufficient. 
Consequently, the structure of the optimal policy is much more complex. In the remainder 
of this section, we focus on a stochastic inventory model with lost sales, positive 
lead time, and zero setup cost. Our analysis is based on Zipkin (2008), in which 
the concept of L♮-convexity introduced in Chap. 2 plays a critical role. 

To completely describe the inventory system, at the beginning of period t after 
receiving the order placed l periods ago but before placing an order at the current 
period, let s_i (i = 0, ..., l − 1) be the inventory level at the warehouse plus 
the amount of inventory that will arrive within i periods. The state of the 
inventory system can be described by s = [s_0, s_1, ..., s_{l−1}]. Note that s_{l−1} is the 
inventory position before placing an order. Let s_l be the inventory order-up-to 
position. Given the realized demand D at period t, the state of the next period, 
s̃ = [s̃_0, s̃_1, ..., s̃_{l−1}], is given by 

$$\tilde s_i = s_{i+1} - s_0 \wedge D \quad \forall\, i = 0, 1, \ldots, l - 1,$$

and the expected cost for the remaining T − t + 1 periods immediately after an 
order is placed to raise the inventory position to s_l if we act optimally in the 
remaining T − t periods can be represented as 

$$G_t(s, s_l, D) = h^+ \max(s_0 - D, 0) + h^- \max(D - s_0, 0) + z_{t+1}(\tilde s).$$

The expected cost incurred through the remaining T − t + 1 periods if we act 
optimally in period t and the remaining T − t periods, z_t(s), can then be derived 
by the following dynamic program: 

$$z_t(s) = \min_{s_l \ge s_{l-1}} \{ c(s_l - s_{l-1}) + E[G_t(s, s_l, D)] \},$$

where the feasible set of the states is 

$$\mathcal{S} = \{ (s_0, s_1, \ldots, s_{l-1}) : 0 \le s_0 \le s_1 \le \ldots \le s_{l-1} \}.$$

Note that unlike the backlogging models, we need to keep the linear ordering cost 
component in the formulation. 
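
To make the state dynamics concrete, the following sketch applies the one-period transition of the lost sales model to a small example. It reads the transition as s̃_i = s_{i+1} − min(s_0, D); the function names and the numbers are hypothetical and serve only as an illustration.

```python
def next_state(s, s_l, demand):
    """One-period lost-sales transition.

    s      : current state [s_0, ..., s_{l-1}] of cumulative positions
    s_l    : order-up-to position chosen in this period
    demand : realized demand D
    Returns [s_1 - u, ..., s_l - u] with u = min(s_0, D) units sold.
    """
    sold = min(s[0], demand)
    extended = list(s) + [s_l]
    return [extended[i + 1] - sold for i in range(len(s))]


def period_cost(s, demand, h_plus, h_minus):
    """Holding cost on leftover stock plus penalty on lost demand."""
    return h_plus * max(s[0] - demand, 0) + h_minus * max(demand - s[0], 0)


if __name__ == "__main__":
    state = [2, 5, 7]            # on hand 2, then 3 and 2 units in transit
    print(next_state(state, s_l=10, demand=3))   # [3, 5, 8]
    print(period_cost(state, 3, 1.0, 4.0))       # 4.0
```

In the example only two of the three demanded units can be sold, so every component of the state drops by two rather than three; this truncation is what prevents the inventory position alone from summarizing the state.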

Lemma 9.6.1 For any s ∈ S and s_l ≥ s_{l−1}, z_t(s) and G_t(s, s_l, D) are L♮-convex. 

Proof. Clearly, S is L♮-convex from Proposition 2.3.3 part (e), and so is the set 
{(s, s_l) : s ∈ S, s_{l−1} ≤ s_l}. From Proposition 2.3.4 parts (c) and (e), it suffices to 
show the L♮-convexity of G_t(s, s_l, D). For this purpose, we claim that G_t(s, s_l, D) 
equals the optimal objective value of the following optimization problem: 

$$\begin{aligned} \min\ & h^+(s_0 - u) + h^-(D - u) + z_{t+1}(s_1 - u, s_2 - u, \ldots, s_{l-1} - u, s_l - u) \\ \text{s.t.}\ & 0 \le u \le D, \\ & u \le s_0. \end{aligned}$$

To see this, note that the optimization problem allows the firm to hold inventory even 
when there is unsatisfied demand. However, since we face a stationary system, it 
can never be optimal to hold inventory and reject demand at the same time since it 
is more cost-effective to fill one unit of current demand than one unit of future demand 
by avoiding any additional holding cost. Therefore, the optimal u is min(D, s_0) 
and our claim is correct. Finally, Proposition 2.3.4 part (d) implies that the above 
objective function is L♮-convex. Thus, Proposition 2.3.4 part (e) is applicable and 
G_t(s, s_l, D) is L♮-convex. ∎

Having established the L♮-convexity of z_t(s) and G_t(s, s_l, D), we can derive the 
monotonicity of the optimal inventory order-up-to position s_l*(s) and the optimal 
ordering quantity s_l*(s) − s_{l−1}. Let x_0 be the on-hand inventory and x_i the inventory 
in transit that is to arrive in i periods (i > 0), and let x = [x_0, x_1, ..., x_{l−1}]. 

Theorem 9.6.2 The optimal inventory order-up-to position s_l*(s) is increasing 
in s. However, s_l*(s + ξe) ≤ s_l*(s) + ξ for any ξ > 0. Thus, the optimal ordering 
quantity x_l*(x) as a function of x satisfies the following inequalities: 

$$x_l^*(x) \ge x_l^*(x + \xi e_1) \ge x_l^*(x + \xi e_2) \ge \cdots \ge x_l^*(x + \xi e_{l-1}) \ge x_l^*(x) - \xi \quad \forall\, \xi > 0,$$

where e_i is the unit vector with 1 at the ith element. 

Proof. The claim on the optimal inventory order-up-to position s_l*(s) follows 
directly from Lemma 2.3.5. To prove the claim on the optimal ordering quantity 
x_l*(x), note that x_l*(x) = s_l*(s) − s_{l−1}. Thus, for any i ≤ l − 2 and ξ > 0, 

$$\begin{aligned} x_l^*(x + \xi e_{i+1}) &= s_l^*(s_0, \ldots, s_i, s_{i+1} + \xi, \ldots, s_{l-1} + \xi) - s_{l-1} - \xi \\ &\le s_l^*(s_0, \ldots, s_{i-1}, s_i + \xi, \ldots, s_{l-1} + \xi) - s_{l-1} - \xi \\ &= x_l^*(x + \xi e_i) \\ &\le s_l^*(s + \xi e) - s_{l-1} - \xi \\ &\le s_l^*(s) - s_{l-1} \\ &= x_l^*(x), \end{aligned}$$

where e is the all-ones vector. Here the first two inequalities hold since s_l*(s) is 
increasing in s, and the last inequality holds since s_l*(s + ξe) ≤ s_l*(s) + ξ for ξ > 0. 
To prove that x_l*(x + ξe_{l−1}) ≥ x_l*(x) − ξ, note that 

$$x_l^*(x + \xi e_{l-1}) = s_l^*(s + \xi e_{l-1}) - s_{l-1} - \xi \ge s_l^*(s) - s_{l-1} - \xi = x_l^*(x) - \xi,$$

where the inequality follows from the monotonicity of s_l*(s) in s. ∎

If x_l*(x) is differentiable, the above theorem implies that 

$$0 \ge \frac{\partial x_l^*(x)}{\partial x_1} \ge \frac{\partial x_l^*(x)}{\partial x_2} \ge \cdots \ge \frac{\partial x_l^*(x)}{\partial x_{l-1}} \ge -1;$$

that is, the optimal ordering quantity has bounded monotone sensitivities. 
Specifically, it decreases in the on-hand inventory and the inventory in transit. In addition, 
it is more sensitive to newer outstanding orders than to older outstanding orders and 
the on-hand inventory, with bounded sensitivities. 



9.7 Multi-Echelon Systems 

Consider a distribution system with a single warehouse, denoted by the index 0, 
and n retailers, indexed from 1 to n. Incoming orders from an outside vendor with 
unlimited stock are received by the warehouse that replenishes the retailers. We 
refer to the warehouse or the retailers as facilities. The transportation lead time 
to facility i = 0, 1, 2, ..., n, is a constant L_i. 

As in the previous section, we analyze a discrete-time model in which customer 
demands are independent and identically distributed and are faced only by the 
retailers. Every time a facility places an order, it incurs a setup cost K_i, i = 
0, 1, 2, ..., n. The echelon inventory holding cost (see Chap. 7) is h_i^+ at facility 
i, i = 0, 1, 2, ..., n. Finally, demand is backlogged at a penalty cost of h_i^-, i = 
1, 2, ..., n, per unit per period. The objective is to find a centralized strategy, that 
is, a strategy that uses systemwide inventory information, so as to minimize the 
long-run average system cost. 

As the reader no doubt understands, the analysis of stochastic distribution 
models is quite difficult and finding an optimal strategy is close to impossible; 
consider the difficulty involved in finding an approximate solution for its deter- 
ministic, constant-demand counterpart; see Chap. 7. As a result, limited literature 
is available. The rare exceptions are the approximate strategy suggested by Eppen 
and Schrage (1981) and the lower bounds developed by Federgruen and Zipkin 
(1984a-c) and Chen and Zheng (1994). We briefly describe these two bounds here. 

For this purpose, let the echelon inventory position at a facility be defined as 
the echelon inventory at that facility plus inventory in transit to that facility. 

Consider the following approach suggested by Federgruen and Zipkin (1984a-c). 
Given an inventory position y_i at retailer i, let the loss function G_i(y_i) be 

$$G_i(y_i) = E\big[ h_i^+ \max\{0, y_i - D\} + (h_i^- + h_0^+) \max\{0, D - y_i\} \big],$$

where D is the total demand faced by retailer i during L_i + 1 periods (see the end 
of the previous section for a discussion). 

Consider now any inventory policy with echelon inventory of y units at the 
warehouse and inventory position y_i at retailer i. The expected one-period holding 
and shortage cost in the system is 

$$G(y) = h_0^+(y - \mu) + \sum_{i=1}^{n} G_i(y_i),$$

where μ is the expected single-period systemwide demand. Since, by definition, 
$y \ge \sum_{i=1}^{n} y_i$, a lower bound on G(y) is obtained by finding 

$$G_0(y) = \min_{y_1, \ldots, y_n} \Big\{ h_0^+(y - \mu) + \sum_{i=1}^{n} G_i(y_i) \;\Big|\; \sum_{i=1}^{n} y_i \le y \Big\}. \tag{9.25}$$
i= 1 i= 1 

Thus, a lower bound on the long-run average system cost, C^{FZ}, is obtained by 
solving a single-facility inventory problem with loss function G_0 and setup cost 
K_0. Notice that this bound does not take into account the retailer-specific setup 
costs. These are incorporated in the next lower bound, that of Chen and Zheng (1994). 

To describe their lower bound, consider the following assembly-distribution 
system associated with the original distribution system. In the assembly-distribution 
system, each retailer sells a product consisting of two components: a basic 
component, denoted by a_0, and a retailer-specific component, denoted by a_i. Each 
retailer receives component a_0 from the warehouse, which receives it from the 
outside supplier. On the other hand, component a_i is supplied directly from the 
vendor to retailer i. The arrival of a basic component at retailer i is coordinated 
with the arrival of component a_i. That is, at the time the warehouse delivers basic 
components to retailer i, the same number of a_i components is shipped to the 
retailer from the supplier. These two shipments arrive at the same time and the final 
product is assembled, each containing one basic component and one a_i component. 

To ensure that the original distribution system and the assembly-distribution 
system are, in some sense, equivalent, we allocate cost in the new system as follows. 
Associated with retailer i is a single-facility inventory model with setup cost K_i, 
holding cost h_i^+, and shortage cost h_i^- + h_0^+. The delivery lead time to the facility is 
L_i and the demand is distributed according to the demand faced by retailer i. This 
is, of course, a standard inventory model for which an (s_i, S_i) policy is optimal. 
Let C_i be the long-run average cost associated with this optimal policy. Given an 
inventory position y, let G_i(y) be the associated loss function. Finally, let 

$$G_i^1(y) = \begin{cases} C_i, & \text{if } y \le S_i, \\ G_i(y), & \text{if } y > S_i, \end{cases}$$

and G_i^0(y) = G_i(y) − G_i^1(y). 

In the assembly-distribution system, costs are charged as follows. A setup cost 
K_0 is allocated to the basic component and a setup cost K_i to each component a_i, 
and an expected holding and penalty cost, that is, loss function, of G_i^0 to the basic 
component and G_i^1 to component a_i. Notice that since shipments are coordinated, 
there is no difference between the long-run average cost in the original system and 
in the assembly-distribution system. 

To find a lower bound on the long-run average cost of the original system, 
we consider a relaxation of the assembly-distribution system in which the basic 
components can be sold independently of the other components. Thus, C_i, i = 
1, 2, ..., n, is exactly the long-run average cost associated with the distribution of 
component a_i. Let C_0 be a lower bound on the long-run average cost of the basic 
component. Consequently, $\sum_{i=0}^{n} C_i$ is a lower bound on the long-run average cost 
of the original distribution system. 

It remains to find C_0. This is obtained following the approach suggested by 
Federgruen and Zipkin and described above. For this purpose, we replace G_i by 
G_i^0 in (9.25) and take C^{FZ} as C_0. 



9.8 Exercises 



Exercise 9.1. In (9.1), we assume that F(D) is continuous. Now suppose that 
F(D) is not necessarily continuous. Does there exist an S such that z(y) is 
minimized at y = S? If there exists such an S, how can you determine it? 

Exercise 9.2. Prove (9.19). 

Exercise 9.3. It is now June and your company has to make a decision regarding 
how many ski jackets to produce for the coming winter season. It costs c dollars 
to produce one ski jacket, which can be sold for r dollars. Ski jackets not sold 
during the winter season are lost. Suppose your marketing department estimates 
that demand during the season can take one of the values D_1, D_2, ..., D_k, k > 3. 
Since this is a new product, they do not know what probabilities to attach to each 
possible demand D_i; that is, they do not have estimates of the probability that 
demand during the winter season will be D_i, i = 1, 2, ..., k. They have, however, 
a good estimate of average demand μ and the variance of the demand σ². Your 
objective is to find the production quantity y that will protect you against the 
worst probability distribution possible while maximizing profit. For this purpose, 
you would like to consider the following optimization model: 

$$\text{MAXIMIZE}_{y}\ \ \text{MINIMIZE}_{p_1, \ldots, p_k \in \mathcal{P}}\ \ \text{Average Profit}, \tag{9.26}$$

where $\mathcal{P}$ is the set of all possible discrete distribution functions with mean μ and 
variance σ². 

(a) Write an expression for the average profit as a function of the production 
quantity y and the unknown probabilities p_1, p_2, ..., p_k. 

(b) Suppose we have already determined the production quantity y. Write a 
linear program that identifies the worst possible distribution, that is, the 
one that minimizes average profit. 

(c) Given a value of y, characterize the worst possible distribution; that is, 
identify the number of demand points that have positive probabilities in the 
probability distribution found in the previous question. 

(d) Can you formulate a linear program that finds the optimal production 
quantity; that is, can you write a linear program that solves equation (9.26)? 

Exercise 9.4. Consider the following discrete version of the newsboy problem. 
Demand for a product can take the values D_1, D_2, ..., D_n, n > 3, with probabilities 
p_1, p_2, ..., p_n, where $\sum_{i=1}^{n} p_i = 1$. Let r be a known selling price per unit and 
c be a known cost per unit. Our objective is to find an order quantity y that 
maximizes expected profit. Prove that the optimal order quantity that maximizes 
the expected profit must be one of the demand points, D_1, D_2, ..., D_n. 

Exercise 9.5. Prove Lemma 9.3.2, parts (a), (b), and (c). 

Exercise 9.6. Consider the newsboy problem with demand D being a random 
variable whose density, f(D), is known. Let r be a known selling price per unit 
and c be a known cost per unit. Assume no initial inventory and no salvage value. 
The objective is to find an order quantity y that maximizes expected profit. 

(a) Let a service level be defined as the probability that demand is no more 
than the order quantity, y. Our objective is to find the order quantity, y, 
that maximizes expected profit subject to the requirement that the service 
level is at least α. What is the optimal order quantity as a function of α, c, 
r, and f(D)? 

(b) Suppose there is no service-level requirement; however, there is a capacity 
constraint, C, on the amount we can order. That is, the order quantity, 
y, cannot be more than C. What is the optimal order quantity, y, that 
maximizes expected profit subject to the capacity constraint, C? 

(c) Suppose there is a service-level requirement, α, and a capacity constraint, 
C. What is the optimal order quantity, y, that maximizes expected profit 
subject to the constraints that service level is at least α and the capacity 
constraint, C? 

Exercise 9.7. Prove that a function f : ℝ → ℝ is K-convex if and only if for any 
z ≥ 0, b > 0, and any y, we have 

$$K + f(y + z) \ge f(y) + \frac{z}{b}\big(f(y) - f(y - b)\big).$$

Exercise 9.8. Prove Lemma 9.5.2, part (b). 

Exercise 9.9. (Pang 2011) If a function f : ℝ → ℝ is K-convex and non-K-decreasing, 
then for λ ≥ γ ≥ 0, the function f(λx⁺ − γ(−x)⁺) is K-convex. Use 
this result to show that the analysis in Sect. 9.3 can be carried over to lost sales 
models with zero lead time. 


Integration of Inventory and Pricing 



10.1 Introduction 

In the previous chapters, we analyzed the traditional inventory models, which focus 
on effective replenishment strategies and typically assume that a commodity’s price 
is exogenously determined. In recent years, however, a number of industries have 
used innovative pricing strategies to manage their inventory effectively. For 
example, techniques such as revenue management have been applied by airlines, 
hotels, and rental car agencies, integrating price, inventory control, and quality of 
service; see Kimes (1989). In the retail industry, to name another example, 
dynamically pricing commodities can provide significant improvements in profitability, as 
shown by Gallego and van Ryzin (1994). 

These developments call for models that integrate inventory control and pricing 
strategies. Such models are clearly important not only in the retail industry, where 
price-dependent demand plays an important role, but also in manufacturing en- 
vironments in which production/distribution decisions can be complemented with 
pricing strategies to improve the firm’s bottom line. 

The coordination of replenishment strategies and pricing policies has been 
the focus of many papers, starting with the work of Whitin (1955), who analyzed 
the celebrated newsvendor problem with price-dependent demand. For a review, 
the reader is referred to Eliashberg and Steinberg (1991), Petruzzi and Dada 
(1999), Federgruen and Heching (1999), Yano and Gilbert (2002), Elmaghraby 
and Keskinocak (2003), Chan, Shen, Simchi-Levi and Swann (2004), or Chen and 
Simchi-Levi (2012). 

In this chapter, we review some of the main progress on stochastic models. 
We first present regularity conditions on demand modeling and some commonly 
used demand models in Sect. 10.2. We then analyze the single-period models in 
Sect. 10.3. The finite-horizon models based on Chen and Simchi-Levi (2004a) are 
analyzed in Sect. 10.4, followed in Sect. 10.5 by an alternative approach due to 
Huh and Janakiraman (2008). A brief description of extensions and challenges is 
presented in Sect. 10.6. Finally, in Sect. 10.7, we focus on risk-averse inventory 
(and pricing) models based on Chen, Sim, Simchi-Levi and Sun (2007). 



10.2 Demand Models 

To make optimal pricing decisions, it is pivotal that we know the volume of a 
product that customers are willing to purchase at a specific price. The relationship 
between the volume and price gives rise to a demand model. The demand of 
a product can depend on many variables other than price, such as quality, brand 
name, and competitors' prices; these variables change in each scenario. Here, 
however, we restrict our discussion to demand models of a single product in which 
price is the only variable. 

Economic theory provides us with basic demand models derived from the 
classical rational choice theory of consumer behavior (we refer to van Ryzin (2012) and 
Chap. 7 in Talluri and van Ryzin (2004), as well as the references therein, for 
more details). Built upon this theory, several regularity conditions are imposed 
on deterministic demand functions of a single product. Let $[\underline{p}, \bar{p}]$ be the feasible 
domain of price. 

Assumption 10.2.1 For a given selling price $p \in [\underline{p}, \bar{p}]$, the demand function, 
D(p), satisfies the following conditions: 

(a) D(p) is continuous in p. 

(b) D(p) is strictly decreasing in p and thus has an inverse D^{-1}(d). 

(c) D(p) ∈ [0, +∞). 

(d) The revenue, D^{-1}(d)d, is concave in d. 

These regularity conditions are quite intuitive and not restrictive in most cases. 
They are imposed to avoid unnecessary technical complications. Some commonly 
used deterministic demand functions include 

• the linear demand d(p) = b − ap for p ∈ [0, b/a] (a > 0 and b > 0), 

• the exponential demand d(p) = e^{b − ap} (a > 0 and b > 0), 

• the iso-elasticity demand d(p) = b p^{−a} (a > 1 and b > 0) [note that the price 
elasticity of demand, e(p), is the relative change in demand in response to a 
relative change in price, i.e., $e(p) = -\frac{p}{d(p)} \frac{\partial d(p)}{\partial p}$], 

• the Logit demand $d(p) = \Omega \frac{e^{-ap}}{1 + e^{-ap}}$, which is the product of the market size 
Ω and the probability that a customer with a coefficient of price sensitivity 
a buys at price p. 

It is straightforward to check that the above demand functions satisfy the 
regularity conditions in Assumption 10.2.1. 
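
The four demand functions above are easy to experiment with numerically. The sketch below is an illustration only: the parameter values are arbitrary, the function names are ours, the Logit form is taken to be Ω e^{−ap}/(1 + e^{−ap}), and the elasticity is approximated by a central difference rather than computed in closed form.

```python
import math


def linear_demand(p, a=2.0, b=10.0):
    """d(p) = b - a p, meaningful for 0 <= p <= b / a."""
    return b - a * p


def exponential_demand(p, a=0.5, b=2.0):
    """d(p) = exp(b - a p)."""
    return math.exp(b - a * p)


def iso_elastic_demand(p, a=2.0, b=10.0):
    """d(p) = b p^(-a), with constant price elasticity a."""
    return b * p ** (-a)


def logit_demand(p, a=1.0, omega=100.0):
    """Market size omega times the purchase probability at price p."""
    return omega * math.exp(-a * p) / (1.0 + math.exp(-a * p))


def elasticity(demand, p, eps=1e-6):
    """Price elasticity -(p / d(p)) d'(p), via a central difference."""
    slope = (demand(p + eps) - demand(p - eps)) / (2.0 * eps)
    return -p * slope / demand(p)


if __name__ == "__main__":
    prices = [0.5 + 0.5 * k for k in range(9)]       # 0.5, 1.0, ..., 4.5
    for f in (linear_demand, exponential_demand,
              iso_elastic_demand, logit_demand):
        values = [f(p) for p in prices]
        decreasing = all(u > v for u, v in zip(values, values[1:]))
        print(f.__name__, "strictly decreasing on grid:", decreasing)
    print("iso-elastic elasticity at p = 3:",
          round(elasticity(iso_elastic_demand, 3.0), 4))
```

A grid check of this kind does not prove the regularity conditions, but it is a quick sanity test when calibrating a demand model to data.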

In stochastic settings, the demand of a product, denoted by D(p, ε), is often 
represented as a function of the price p and a random noise ε independent of p. 
Sometimes it is important to specify the way in which the random noise ε enters the 
demand function. For our purpose, we focus on stochastic demand models under 
the following assumptions. 

Assumption 10.2.2 For a given price p, the demand is given by 

D(p, ε) = αD(p) + β, (10.1) 

where D(p) satisfies Assumption 10.2.1, ε = (α, β), and α is a nonnegative random 
variable with E[α] = 1 and E[β] = 0. 

An implicit assumption here is that the realized demand D(p, ε) is always 
nonnegative, which imposes some conditions on the selling price and the two random 
variables α and β. Observe that, by scaling and shifting, the assumptions E[α] = 1 
and E[β] = 0 can be made without loss of generality. 

A special case of this demand function is the additive demand function. In this 
case, the demand function is of the form D(p, ε) = D(p) + β. Another special case 
of the demand function (10.1) is the model with multiplicative demand. In this 
case, the demand function is of the form D(p, ε) = αD(p), where α is a random 
variable. 

Observe that for additive demand in which α = 1, the demand variance is 
independent of the price, while the coefficient of variation (the ratio of standard 
deviation and mean) is dependent on the price. In contrast, for multiplicative 
demand in which β = 0, the coefficient of variation does not depend on the price 
while the variance does. In single-product settings with decreasing expected 
demand d(p), a higher price leads to a higher uncertainty for additive demand but a 
lower uncertainty for multiplicative demand. 
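
The contrast between the two special cases can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and numbers are ours, "uncertainty" is measured by the coefficient of variation in the additive case and by the variance in the multiplicative case, and a higher price is represented simply by a lower expected demand.

```python
import math


def additive_stats(mean_demand, var_beta):
    """Demand D(p) + beta: variance is Var(beta), whatever the price."""
    variance = var_beta
    cv = math.sqrt(variance) / mean_demand
    return variance, cv


def multiplicative_stats(mean_demand, var_alpha):
    """Demand alpha * D(p): variance scales with D(p) squared."""
    variance = var_alpha * mean_demand ** 2
    cv = math.sqrt(variance) / mean_demand
    return variance, cv


if __name__ == "__main__":
    # A price increase lowers expected demand from 8 to 4 units.
    print(additive_stats(8.0, 4.0), additive_stats(4.0, 4.0))
    # (4.0, 0.25) then (4.0, 0.5): same variance, larger CV
    print(multiplicative_stats(8.0, 0.25), multiplicative_stats(4.0, 0.25))
    # (16.0, 0.5) then (4.0, 0.5): same CV, smaller variance
```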



10.3 Single-Period Stochastic Models 

We start by analyzing a single-period problem in which a risk-neutral retailer has 
to decide on its stock level and the selling price of a single product. In this problem, 
demand is stochastic and depends on the selling price. In particular, we assume 
that the demand follows Assumption 10.2.2. An ordering and pricing decision is 
made before the realization of the demand uncertainty. The unit ordering cost is c 
and unsatisfied demand is filled with an emergency order. Let $h(x)$ be the inventory holding/disposal cost or the emergency ordering cost when the inventory level after satisfying the demand is $x$. A common form of $h(x)$ is as follows:

$$h(x) = h^+ \max(0, x) + h^- \max(0, -x), \tag{10.2}$$

where $h^+$ is the unit inventory holding/disposal cost if $h^+$ is nonnegative or the unit salvage value if it is negative, and $h^-$ is the unit cost for the emergency order. We assume that $h(x)$ is convex and 0 is a minimizer of the function $cx + h(x)$. For $h(x)$ having the particular form (10.2), the above assumptions imply that

$$h^- \ge c \ge \max\{0, -h^+\}.$$

That is, the salvage value is no more than the normal unit ordering cost, which in 
turn is no more than the unit cost for the emergency order. 
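The cost structure and the ordering of the three unit costs can be checked numerically. The sketch below evaluates the piecewise-linear cost with slopes `h_plus` and `h_minus` and confirms on a grid that 0 minimizes $cx + h(x)$; the numerical values are hypothetical and chosen only for illustration.

```python
def h(x, h_plus, h_minus):
    """Piecewise-linear cost: holding/disposal if x > 0, emergency order if x < 0."""
    return h_plus * max(0.0, x) + h_minus * max(0.0, -x)


def zero_minimizes(c, h_plus, h_minus):
    """Condition h^- >= c >= max(0, -h^+) under which 0 minimizes c*x + h(x)."""
    return h_minus >= c >= max(0.0, -h_plus)


# Hypothetical numbers: unit cost 5, salvage value 2 (so h_plus = -2),
# emergency ordering cost 8.
c, h_plus, h_minus = 5.0, -2.0, 8.0
grid = [i / 10 for i in range(-50, 51)]
best_x = min(grid, key=lambda x: c * x + h(x, h_plus, h_minus))
print(zero_minimizes(c, h_plus, h_minus), best_x)
```

If the salvage value exceeded the ordering cost, say `h_plus = -6.0` with `c = 5.0`, the condition would fail and the retailer could profit from ordering stock solely to salvage it.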

For a given stock level y and a selling price p, the expected profit of the retailer 
is calculated as follows: 

$$v(y, p) = E[p D(p, \epsilon)] - cy - E[h(y - D(p, \epsilon))].$$

Assumption 10.2.2 implies that there is a one-to-one correspondence between the selling price $p$ and the expected demand $d$. Thus, we have an equivalent representation for the retailer's expected profit:

$$\phi(y, d) = R(d) - cy - E[h(y - \alpha d - \beta)].$$

The objective of the retailer is to find a stock level and a selling price, correspondingly an associated expected demand, so as to maximize the retailer's expected profit, namely,

$$\max_{y \ge 0,\ d \in [\underline{d}, \bar{d}]} \phi(y, d), \tag{10.3}$$

where $\underline{d}$ and $\bar{d}$ are the lower and upper bounds of the expected demand corresponding to the upper and lower bounds of the selling price; that is,

$$\underline{d} = D(\bar{p}) \quad \text{and} \quad \bar{d} = D(\underline{p}).$$

Notice that $\phi(y, d)$ is jointly concave in $y$ and $d$, and hence the above optimization can be solved efficiently.

Our intention here is to compare the selling prices under deterministic and 
stochastic demands. In particular, we show that there is a significant difference 
between the additive demand case and the multiplicative demand case. Before we 
proceed to our main result of this section, we need the following lemma. 

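Lemma 10.3.1 Let $f$ be a convex function over $\mathbb{R}$. Then for any $x, d, \eta \ge 0$,

$$E[f(x - \alpha d)] \le E[f(x + \eta - \alpha(d + \eta))],$$

where $\alpha$ is a nonnegative random variable with $E[\alpha] = 1$.

Proof. Notice that a convex function has nondecreasing differences and that $x + \eta - \alpha(d + \eta) = x - \alpha d - (\alpha - 1)\eta$. Hence, we have that for any $x, d, \eta, \alpha \ge 0$,

$$f(x - \alpha d) - f(x - \alpha d - (\alpha - 1)\eta) \le f(x - d) - f(x - d - (\alpha - 1)\eta),$$

since $x - \alpha d \le x - d$ exactly when the shift $(\alpha - 1)\eta$ is nonnegative. Taking expectation on both sides of the above inequality and using Jensen's inequality, which gives $f(x - d) \le E[f(x - d - (\alpha - 1)\eta)]$ because $E[\alpha] = 1$, give us the result. ■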
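The lemma can be sanity-checked numerically for a discrete $\alpha$. The sketch below is an illustration under our own assumptions: a three-point distribution for $\alpha$ with mean 1, the convex test function $f(u) = |u| + u^2$, and arbitrary values of $x$, $d$, and $\eta$.

```python
def expectation(fn, values, probs):
    """Expected value of fn over a finite distribution."""
    return sum(p * fn(v) for v, p in zip(values, probs))


def lemma_gap(f, x, d, eta, alphas, probs):
    """E[f(x + eta - alpha*(d + eta))] - E[f(x - alpha*d)].

    The lemma asserts that this gap is nonnegative for convex f.
    """
    lhs = expectation(lambda a: f(x - a * d), alphas, probs)
    rhs = expectation(lambda a: f(x + eta - a * (d + eta)), alphas, probs)
    return rhs - lhs


alphas, probs = [0.5, 1.0, 1.5], [0.25, 0.5, 0.25]   # E[alpha] = 1
f = lambda u: abs(u) + u * u                          # a convex test function
for eta in (0.0, 0.5, 1.0, 2.0):
    print(eta, lemma_gap(f, 3.0, 2.0, eta, alphas, probs))
```

Raising the stock level and the expected demand by the same amount $\eta$ leaves the expected leftover inventory unchanged but spreads it out, which a convex cost penalizes.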

Now we are ready to present one of our main results of this section. For simplicity, we assume that the expected revenue is strictly concave in the expected demand so that the optimization problem (10.3) has a unique optimal $d$.

Theorem 10.3.2 Under the assumption that the expected revenue is strictly concave in the expected demand, the optimal selling price for the additive demand case equals the optimal selling price for the deterministic demand case, which, in turn, is no more than the optimal selling price for the multiplicative demand case.

Proof. It suffices to prove that there exist $(y_d^*, d_d^*)$, $(y_a^*, d_a^*)$, and $(y_m^*, d_m^*)$ optimal for problem (10.3) with deterministic, additive, and multiplicative demand, respectively, such that $d_a^* = d_d^* \ge d_m^*$.

First, notice that

$$\phi(y, d) = R(d) - cd - E[c(y - \alpha d - \beta) + h(y - \alpha d - \beta)].$$

For the deterministic demand case with $\alpha = 1$ and $\beta = 0$, since 0 is a minimizer for the function $cx + h(x)$, it is optimal to set a selling price such that the realized demand is $d_d^*$, which solves

$$\max_{d \in [\underline{d}, \bar{d}]} R(d) - cd,$$

and to order the demand exactly, that is, $y_d^* = d_d^*$.

Now we prove that there exists an optimal solution $(y_a^*, d_a^*)$ for problem (10.3) with additive demand such that $d_a^* = d_d^*$. If $d_a^* < d_d^*$, then $(y_a^* + \eta, d_a^* + \eta)$ gives an objective value no less than that given by $(y_a^*, d_a^*)$ for a sufficiently small positive $\eta$. If $d_a^* > d_d^*$, we distinguish between two cases. First, $y_a^* > 0$. In this case, $(y_a^* - \eta, d_a^* - \eta)$ gives an objective value no less than that given by $(y_a^*, d_a^*)$ for a sufficiently small positive $\eta$. Second, $y_a^* = 0$. In this case, $(0, d_a^* - \eta)$ gives an objective value no less than that given by $(y_a^*, d_a^*)$ for a sufficiently small positive $\eta$, since 0 is a minimizer of the function $cx + h(x)$. Therefore, there exists an optimal solution $(y_a^*, d_a^*)$ for problem (10.3) with additive demand such that $d_a^* = d_d^*$.

Finally, we argue that there exists an optimal solution $(y_m^*, d_m^*)$ for problem (10.3) with multiplicative demand such that $d_m^* \le d_d^*$. Assume that $d_m^* > d_d^*$. Again, we distinguish between two cases. First, $y_m^* > 0$. In this case, Lemma 10.3.1 implies that $(y_m^* - \eta, d_m^* - \eta)$ gives an objective value no less than that given by $(y_m^*, d_m^*)$ for a sufficiently small positive $\eta$. Second, $y_m^* = 0$. Similar to the argument for the additive demand case, $(0, d_m^* - \eta)$ gives an objective value no less than that given by $(y_m^*, d_m^*)$ for a sufficiently small positive $\eta$, since 0 is a minimizer of the function $cx + h(x)$. Therefore, there exists an optimal solution $(y_m^*, d_m^*)$ for problem (10.3) with multiplicative demand such that $d_m^* \le d_d^*$. ■

The above theorem thus implies that there is a significant difference between 
the additive demand case and the multiplicative demand case. To understand 
this difference, notice that the variance of the additive demand is independent of 
the selling price, while the variance of the multiplicative demand is a decreasing 
function of the selling price. Thus, for the multiplicative demand case, the retailer 
tends to choose a higher selling price so as to decrease the variability of the demand. 
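The comparison in Theorem 10.3.2 can be reproduced on a small instance by brute-force grid search over $(y, d)$. The sketch below is an illustration only: the linear mean demand $D(p) = A - Bp$ (so that $R(d) = d(A - d)/B$), the cost parameters, the two-point distributions of $\alpha$ and $\beta$, and the grids are all assumptions of ours.

```python
from itertools import product

A, B = 100.0, 2.0                    # hypothetical mean demand D(p) = A - B*p
C, H_PLUS, H_MINUS = 5.0, 1.0, 9.0   # hypothetical c, h^+, h^-


def revenue(d):
    """Expected revenue R(d) = d * D^{-1}(d) for the linear mean demand."""
    return d * (A - d) / B


def h(x):
    return H_PLUS * max(0.0, x) + H_MINUS * max(0.0, -x)


def profit(y, d, scenarios):
    """phi(y, d) = R(d) - c*y - E[h(y - alpha*d - beta)].

    `scenarios` is a list of equally likely (alpha, beta) pairs.
    """
    exp_cost = sum(h(y - a * d - b) for a, b in scenarios) / len(scenarios)
    return revenue(d) - C * y - exp_cost


def solve(scenarios):
    """Grid search for the best (stock level, expected demand) pair."""
    ys = [0.5 * i for i in range(0, 241)]      # 0, 0.5, ..., 120
    ds = [float(i) for i in range(10, 81)]     # 10, 11, ..., 80
    return max(product(ys, ds), key=lambda yd: profit(yd[0], yd[1], scenarios))


cases = {
    "deterministic": [(1.0, 0.0)],
    "additive": [(1.0, -5.0), (1.0, 5.0)],
    "multiplicative": [(0.5, 0.0), (1.5, 0.0)],
}
for name, scenarios in cases.items():
    y, d = solve(scenarios)
    print(f"{name:15s} y={y:5.1f} d={d:5.1f} price={(A - d) / B:5.2f}")
```

On this instance the search selects the same expected demand for the deterministic and additive cases and a smaller expected demand, hence a higher price, for the multiplicative case.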

In the above discussion, we assume a zero initial inventory level and zero fixed ordering cost. Now let $x$ be the initial inventory level, let $y$ be the target stock level, and also assume that the fixed ordering cost is $K$. In this case, we face the following problem:

$$\max_{y \ge x,\ d \in [\underline{d}, \bar{d}]} -K\delta(y - x) + \phi(y, d) + cx, \tag{10.4}$$

where $\delta(u) = 1$ for $u > 0$ and $\delta(0) = 0$.

In the following, we will show that a simple policy, referred to as the $(s, S, p)$ policy, is optimal for problem (10.4). In such a policy, the inventory is managed based on an $(s, S)$ policy and the optimal price $p(x)$ is a function of the initial inventory level $x$. Moreover, for the special case with zero fixed ordering cost, a base stock list price policy is optimal: The inventory is managed based on a base stock policy, and the optimal price is a nonincreasing function of the initial inventory level.

Theorem 10.3.3 For problem (10.4), an (s, S, p) policy is optimal. Furthermore, 
for the special case with zero fixed ordering cost, a base stock list price policy is 
optimal. 



Proof. First, notice that Theorem 2.2.6 implies that the function $\phi(x, d)$ is supermodular. Thus, from Theorem 2.2.8, there exists a nondecreasing function $d(x)$ such that $d(x)$ maximizes $\phi(x, d)$ for any given inventory level $x$.

Now let $S$ be a maximizer of the function $\phi(x, d(x)) + cx$ and let $s$ satisfy

$$\phi(s, d(s)) + cs = \phi(S, d(S)) + cS - K.$$

Since $\phi(x, d)$ is jointly concave in $(x, d)$, $\phi(x, d(x))$ is a concave function. This allows one to show that the optimal inventory level is managed based on the $(s, S)$ policy. Moreover, the optimal price is a function of the initial inventory level: If $x$ is no more than $s$, the optimal price is $D^{-1}(d(S))$; if $x$ is greater than $s$, the optimal price is $D^{-1}(d(x))$.

Finally, for the special case with zero fixed ordering cost, we have $s = S$ and the optimal selling price is $D^{-1}(d(\max(S, x)))$. Hence, a base stock list price policy is optimal. ■

10.4 Finite-Horizon Models

10.4.1 Model Description

In this section, we focus on a finite-horizon model. Unlike in the single-period models, the structure of the optimal policy differs significantly between the additive demand case and the multiplicative demand case, as we will demonstrate in this section.

Consider a firm that has to make replenishment and pricing decisions over a 
finite time horizon with T periods. 

Demands in different periods are independent of each other. For each period $t$, $t = 1, 2, \ldots, T$, let $d_t$ be the demand and $p_t$ be the selling price in period $t$. We assume that $d_t = \alpha_t D_t(p_t) + \beta_t$, which is time-dependent and satisfies Assumption 10.2.2. Notice that in this section, the random perturbation $\epsilon$, the demand function $D(p, \epsilon)$, and the expected revenue function $R(d)$ are indexed by $t$ to denote time dependence. The selling price $p_t$ is restricted to an interval. In particular, let $\underline{p}_t$ and $\bar{p}_t$ be the lower and upper bounds of the selling price $p_t$, respectively.

Let $x_t$ be the inventory level at the beginning of period $t$, just before placing an order. Similarly, $y_t$ is the inventory level at the beginning of period $t$ after placing an order. The ordering cost function includes both a fixed cost and a variable cost and is calculated for every $t$, $t = 1, 2, \ldots, T$, as

$$K\delta(y_t - x_t) + c_t(y_t - x_t).$$

Lead time is assumed to be zero, and hence an order placed at the beginning of 
period t arrives immediately before demand for the period is realized. 

Unsatisfied demand is backlogged. Let $x$ be the inventory level carried over from period $t$ to the next period. Since we allow backlogging, $x$ may be positive or negative. A cost $h_t(x)$ is incurred at the end of period $t$, which represents the inventory holding cost when $x > 0$ and the shortage cost if $x < 0$.

Given a discount factor $\gamma$ with $0 < \gamma \le 1$, an initial inventory level $x_1 = x$, and a pricing and replenishment policy, let

$$V_T^{\gamma}(x) = \sum_{t=1}^{T} \gamma^{t-1} \bigl( -K\delta(y_t - x_t) - c_t(y_t - x_t) - h_t(x_{t+1}) + p_t D_t(p_t, \epsilon_t) \bigr) \tag{10.5}$$

be the $T$-period total discounted profit for a realization of the random perturbations $\epsilon_t$, where $x_{t+1} = y_t - D_t(p_t, \epsilon_t)$.

The objective is to decide on ordering and pricing policies to maximize the total expected discounted profit over the entire planning horizon. That is, the objective is to maximize

$$E[V_T^{\gamma}(x)] \tag{10.6}$$

for any initial inventory level $x$ and any $0 < \gamma \le 1$.

To find the optimal strategy that maximizes (10.6), let $v_t(x)$ be the maximum total expected discounted profit when $T - t$ periods remain in the planning horizon and the inventory level at the beginning of period $t$ is $x$. A natural dynamic program that can be applied to find the policy maximizing (10.6) is as follows. For $t = 1, 2, \ldots, T$,

$$v_t(x) = c_t x + \max_{y \ge x,\ \bar{p}_t \ge p \ge \underline{p}_t} -K\delta(y - x) + g_t(y, p) \tag{10.7}$$

with $v_{T+1}(x) = 0$ for any $x$, where

$$g_t(y, p) := -c_t y + E[p D_t(p, \epsilon_t) - h_t(y - D_t(p, \epsilon_t)) + \gamma v_{t+1}(y - D_t(p, \epsilon_t))].$$

Observe that the single-period profit function

$$-c_t(y - x) + E[p D_t(p, \epsilon_t) - h_t(y - D_t(p, \epsilon_t))]$$

is not necessarily a concave function of the selling price $p$, since $D_t(p, \epsilon_t)$ may be a nonlinear function of $p$. Fortunately, for the general demand functions (10.1), we can represent the formulation (10.7) only with respect to the expected demand rather than with respect to the price, which allows us to show that the single-period profit function is jointly concave in terms of the inventory level and expected demand. Let

$$\underline{d}_t = D_t(\bar{p}_t) \quad \text{and} \quad \bar{d}_t = D_t(\underline{p}_t).$$

Assumption 10.2.1 implies that there is a one-to-one correspondence between the selling price $p_t \in [\underline{p}_t, \bar{p}_t]$ and the expected demand $D_t(p_t) \in [\underline{d}_t, \bar{d}_t]$.

Denote the expected demand at period $t$ by $d = D_t(p)$. Also, let

$$\phi_t(x) = v_t(x) - c_t x, \quad h_t^{\gamma}(y) = h_t(y) + (c_t - \gamma c_{t+1}) y, \quad \text{and} \quad \hat{R}_t(d) = R_t(d) - c_t d, \tag{10.8}$$

where $c_{T+1} = 0$ and $R_t$ is the expected revenue function with

$$R_t(d) = d D_t^{-1}(d),$$

which, by Assumption 10.2.1, is a concave function of expected demand $d$. These functions, $\phi_t(x)$, $h_t^{\gamma}(y)$, and $\hat{R}_t(d)$, allow us to transform the original problem into a problem with zero variable ordering cost.

Specifically, the dynamic program (10.7) can be written as

$$\phi_t(x) = \max_{y \ge x} -K\delta(y - x) + f_t(y) \tag{10.9}$$

with $\phi_{T+1}(x) = 0$ for any $x$, where

$$f_t(y) = \max_{\bar{d}_t \ge d \ge \underline{d}_t} g_t(y, d), \tag{10.10}$$

with

$$g_t(y, d) = H_t^{\gamma}(y, d) + \gamma E[\phi_{t+1}(y - \alpha_t d - \beta_t)] \tag{10.11}$$

and

$$H_t^{\gamma}(y, d) := -E[h_t^{\gamma}(y - \alpha_t d - \beta_t)] + \hat{R}_t(d).$$

Thus, most of our focus is on the transformed problem (10.9), which has a similar structure to problem (10.7). In this transformed problem, one can think of $h_t^{\gamma}$ as being the holding and shortage cost function, $\hat{R}_t$ as being the revenue function, and the variable ordering cost is equal to zero.
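The transformation can be verified directly from the definitions. The following derivation is a sketch that uses only $E[\alpha_t] = 1$, $E[\beta_t] = 0$, and $v_{t+1}(x) = \phi_{t+1}(x) + c_{t+1} x$; the shorthand $\xi$ for the end-of-period inventory is our own notation.

```latex
\begin{align*}
\xi &:= y - \alpha_t d - \beta_t, \qquad E[\xi] = y - d, \\
g_t(y, p) &= -c_t y + R_t(d) - E[h_t(\xi)] + \gamma E[v_{t+1}(\xi)] \\
  &= -c_t d - c_t E[\xi] + R_t(d) - E[h_t(\xi)]
     + \gamma E[\phi_{t+1}(\xi)] + \gamma c_{t+1} E[\xi] \\
  &= \bigl(R_t(d) - c_t d\bigr)
     - E\bigl[h_t(\xi) + (c_t - \gamma c_{t+1})\,\xi\bigr]
     + \gamma E[\phi_{t+1}(\xi)] \\
  &= \hat{R}_t(d) - E[h_t^{\gamma}(\xi)] + \gamma E[\phi_{t+1}(\xi)].
\end{align*}
```

Subtracting $c_t x$ from both sides of the recursion for $v_t(x)$ then leaves a recursion in $\phi_t$ with a fixed ordering cost only.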

For technical reasons, we need the following assumption on the revenue functions 
and on the holding and shortage cost functions. 

Assumption 10.4.1 For $t = 1, 2, \ldots, T$, $h_t$ is convex and $H_t^{\gamma}(y, d)$ is well defined for any $y$ and $d \in [\underline{d}_t, \bar{d}_t]$. Therefore, $H_t^{\gamma}(y, d)$ is jointly concave in $y$ and $d$ and consequently,

$$Q_t^{\gamma}(x) := \max_{\bar{d}_t \ge d \ge \underline{d}_t} H_t^{\gamma}(x, d) \tag{10.12}$$

is concave. Furthermore, we assume that for any $t$,

$$\lim_{|x| \to \infty} Q_t^{\gamma}(x) = -\infty.$$

Notice that one can think of $H_t^{\gamma}(y, d)$ as being the expected single-period profit excluding the ordering cost for a given inventory level $y$ and a selling price associated with a given expected demand, and $Q_t^{\gamma}(x)$ as being the maximum expected single-period profit excluding the ordering cost for a given inventory level $x$ by choosing the best selling price.
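The transformed recursion lends itself to backward induction on a grid. The sketch below is a minimal illustration under several assumptions of ours: stationary data, an integer inventory grid, finitely many equally structured scenarios `(prob, alpha, beta)` for which `alpha * d + beta` is an integer, and a crude clamping of the state at the grid boundary. The function names and the sample instance are hypothetical.

```python
def solve_dp(T, K, gamma, h_gamma, r_hat, d_grid, scenarios, x_min, x_max):
    """Backward induction for phi_t on the integer grid x_min..x_max.

    Returns (phi, policy), where phi[t][x] approximates phi_t(x) and
    policy[t][x] = (order-up-to level y, expected demand d).
    """
    states = range(x_min, x_max + 1)

    def clamp(z):
        # Crude boundary treatment (an assumption): project onto the grid.
        return min(max(z, x_min), x_max)

    phi = {T + 1: {x: 0.0 for x in states}}   # terminal condition
    policy = {}
    for t in range(T, 0, -1):
        f, d_best = {}, {}
        for y in states:
            def g(d, y=y):
                # g_t(y, d) = H(y, d) + gamma * E[phi_{t+1}(y - alpha*d - beta)]
                total = r_hat(d)
                for prob, alpha, beta in scenarios:
                    z = y - alpha * d - beta
                    total += prob * (-h_gamma(z)
                                     + gamma * phi[t + 1][clamp(round(z))])
                return total

            d_best[y] = max(d_grid, key=g)
            f[y] = g(d_best[y])
        phi[t], policy[t] = {}, {}
        for x in states:
            # Either stay at y = x or pay K and raise inventory to some y > x.
            y_star = max(range(x, x_max + 1),
                         key=lambda y: f[y] - (K if y > x else 0.0))
            phi[t][x] = f[y_star] - (K if y_star > x else 0.0)
            policy[t][x] = (y_star, d_best[y_star])
    return phi, policy


# Hypothetical instance with additive demand (alpha = 1, beta = -2 or +2).
h_gamma = lambda z: 1.0 * max(0, z) + 4.0 * max(0, -z)
r_hat = lambda d: d * (20 - d) / 2 - 2 * d
phi, policy = solve_dp(T=3, K=10.0, gamma=0.9, h_gamma=h_gamma, r_hat=r_hat,
                       d_grid=range(2, 11),
                       scenarios=[(0.5, 1, -2), (0.5, 1, 2)],
                       x_min=-20, x_max=40)
print(policy[1][0], policy[1][30])
```

Tabulating `policy[t]` over the grid gives a quick way to inspect, for a particular instance, at which inventory levels an order is placed and how the chosen expected demand varies with the inventory level.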



10.4.2 Symmetric K-Convex Functions

To motivate the technique used for characterizing the optimal policies for the integrated inventory and pricing models, it is useful to relate our problem to the celebrated stochastic inventory control problem discussed in Chap. 9. In that problem, demand is assumed to be exogenously determined, while here demand depends on price. Other assumptions regarding the framework of the model are similar to those made in Chap. 9. In order to prove that an $(s, S)$ policy is optimal for the stochastic inventory models, we employed the concept of $K$-convexity. It is clear from Definition 9.3.1 that one significant difference between $K$-convexity and traditional convexity is that (9.5) is not symmetric with respect to $x_0$ and $x_1$, and thus it cannot be trivially extended to multidimensional space.

It turns out that this asymmetry is the main barrier when trying to identify the optimal policy to the integrated inventory and pricing problem with nonadditive demand functions. Indeed, there exist counterexamples that show that the function $f_t$ is not necessarily $K$-concave and an $(s, S)$ inventory policy is not necessarily optimal for the finite-horizon model with multiplicative demand functions. This motivates the development of a new concept, the symmetric $K$-concave function, which allows us to characterize the optimal policy in the general demand case.

Definition 10.4.2 A function $f : \mathbb{R}^n \to \mathbb{R}$ is called symmetric $K$-convex for $K \ge 0$ if, for any $x_0, x_1 \in \mathbb{R}^n$ and $\lambda \in [0, 1]$,

$$f((1 - \lambda)x_0 + \lambda x_1) \le (1 - \lambda) f(x_0) + \lambda f(x_1) + \max\{\lambda, 1 - \lambda\} K. \tag{10.13}$$

A function $f$ is called symmetric $K$-concave if $-f$ is symmetric $K$-convex.

Observe that similar to the concept of convexity, the symmetric $K$-convexity is defined in a multidimensional space while the $K$-convexity is only defined in one-dimensional space. Moreover, a $K$-convex function is a symmetric $K$-convex function. The following results describe properties of symmetric $K$-convex functions, properties that are parallel to those summarized in Lemma 9.3.2 and Proposition 9.3.3.
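The defining inequality is easy to test on a finite grid. The sketch below is a one-dimensional illustration in which the grid, the set of values of $\lambda$, and the two test functions are assumptions of ours. A grid check can refute symmetric $K$-convexity but cannot prove it in general.

```python
def is_symmetric_k_convex(f, points, K, lambdas, tol=1e-9):
    """Check the symmetric K-convexity inequality on all grid triples.

    Returns False as soon as some (x0, x1, lambda) violates
    f((1-l)x0 + l*x1) <= (1-l)f(x0) + l*f(x1) + max(l, 1-l)*K.
    """
    for x0 in points:
        for x1 in points:
            for lam in lambdas:
                mid = (1 - lam) * x0 + lam * x1
                bound = ((1 - lam) * f(x0) + lam * f(x1)
                         + max(lam, 1 - lam) * K)
                if f(mid) > bound + tol:
                    return False
    return True


points = list(range(-3, 4))
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
square = lambda x: x * x                     # convex, hence symmetric 0-convex
step = lambda x: 2.0 if x > 0 else 0.0       # a jump of height 2 at the origin

print(is_symmetric_k_convex(square, points, 0.0, lambdas))
print(is_symmetric_k_convex(step, points, 2.0, lambdas))
print(is_symmetric_k_convex(step, points, 1.0, lambdas))
```

The step function illustrates the role of $K$: a jump of height 2 is absorbed when $K = 2$, whereas for $K = 1$ the triple $x_0 = -1$, $x_1 = 3$, $\lambda = 1/2$ violates the inequality.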

Lemma 10.4.3 (a) A real-valued convex function is also symmetric 0-convex and hence symmetric $K$-convex for all $K \ge 0$. A symmetric $K_1$-convex function is also a symmetric $K_2$-convex function for $K_1 \le K_2$.

(b) If $f_1(y)$ and $f_2(y)$ are symmetric $K_1$-convex and symmetric $K_2$-convex, respectively, then for $\alpha, \beta \ge 0$, $\alpha f_1(y) + \beta f_2(y)$ is symmetric $(\alpha K_1 + \beta K_2)$-convex.

(c) If $f(y)$ is symmetric $K$-convex and $\xi$ is a random variable, then $E[f(y - \xi)]$ is also symmetric $K$-convex, provided $E[|f(y - \xi)|] < \infty$ for all $y$.

(d) Assume that $f : \mathbb{R} \to \mathbb{R}$ is a continuous symmetric $K$-convex function and $f(y) \to \infty$ as $|y| \to \infty$. Let $S$ be a global minimizer of $f$ and $s$ be any element from the set

$$X := \{x \mid x \le S,\ f(x) = f(S) + K \text{ and } f(x') \ge f(x) \text{ for any } x' \le x\}.$$

Then we have the following results:

(i) $f(s) = f(S) + K$ and $f(y) \ge f(s)$ for all $y \le s$.

(ii) $f(y) \le f(z) + K$ for all $y, z$ with $(s + S)/2 \le y \le z$.



Proof. Parts (a), (b), and (c) follow directly from the definition of symmetric $K$-convexity. Hence, we focus on part (d). Since $f$ is continuous and $f(y) \to \infty$ as $|y| \to \infty$, $X$ is not empty. Part (d)(i) is a direct consequence of the fact that $s \in X$.

To prove part (d)(ii), we consider two cases. First, for any $y, z$ with $S \le y \le z$, there exists $\lambda \in [0, 1]$ such that $y = (1 - \lambda)S + \lambda z$, and we have from the definition of symmetric $K$-convexity that

$$f(y) \le (1 - \lambda) f(S) + \lambda f(z) + \max\{\lambda, 1 - \lambda\} K \le f(z) + K,$$

where the second inequality follows from the fact that $S$ minimizes $f(x)$.

In the second case, consider $y$ such that $S \ge y \ge (s + S)/2$. In this case, there exists $1 \ge \lambda \ge 1/2$ such that $y = (1 - \lambda)s + \lambda S$, and from the definition of symmetric $K$-convexity, we have that

$$f(y) \le (1 - \lambda) f(s) + \lambda f(S) + \lambda K = f(S) + K \le f(z) + K$$

since $f(s) = f(S) + K$. Hence, (i) and (ii) hold. ■

Figure 10.1 provides an illustration of the property of a symmetric $K$-convex function in Lemma 10.4.3 part (d). Notice that there might exist a set $A \subset (s, (s + S)/2)$ such that $f(x) > f(S) + K$ for $x \in A$.

FIGURE 10.1. Illustration of the properties of a symmetric K-convex function

We now present another important property of symmetric $K$-convex functions, which allows us to prove the symmetric $K$-concavity of the functions $\phi_t(x)$ and $g_t(y, d)$ by dynamic programming backward induction.

Proposition 10.4.4 If $f : \mathbb{R} \to \mathbb{R}$ is a symmetric $K$-convex function, then the function

$$\phi(x) = \min_{y \le x} Q\delta(x - y) + f(y)$$

is symmetric $\max\{K, Q\}$-convex. Similarly, the function

$$\psi(x) = \min_{y \ge x} Q\delta(y - x) + f(y)$$

is also symmetric $\max\{K, Q\}$-convex.



Proof. We only need to prove the symmetric $\max\{K, Q\}$-convexity of function $\phi(x)$. The second part of the result follows from the symmetric property of the symmetric $K$-convexity.

If $Q > K$, we know that $f(x)$ is also a symmetric $Q$-convex function by Lemma 10.4.3 part (a). Hence, it suffices to prove that in the case $K \ge Q$, the symmetric $K$-convexity of the function $f(x)$ implies the symmetric $K$-convexity of the function $\phi(x)$. Thus, in the remaining part of the proof, we assume that $K \ge Q$.

Observe that $\phi(x) \le f(x)$ for any $x$ and $\phi(x) \le Q + f(y)$ for any $y \le x$. Let $E = \{x \mid \phi(x) = f(x)\}$ and $R = \{x \mid \phi(x) < f(x)\}$. We want to show that for any $x_0, x_1$ and $\lambda \in [0, 1]$ with $x_0 \le x_1$,

$$\phi(x_\lambda) \le (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \max\{\lambda, 1 - \lambda\} K, \tag{10.14}$$

where $x_\lambda = (1 - \lambda)x_0 + \lambda x_1$. We will consider four different cases.

Case 1: $x_0, x_1 \in E$. In this case,

$$\begin{aligned}
\phi(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \lambda) f(x_0) + \lambda f(x_1) + \max\{\lambda, 1 - \lambda\} K \\
&= (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \max\{\lambda, 1 - \lambda\} K,
\end{aligned}$$

where the second inequality follows from the symmetric $K$-convexity of the function $f(x)$.

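Case 2: $x_0, x_1 \in R$. In this case, let $\phi(x_i) = Q + f(y_i)$ for $i = 0, 1$, with $y_i \le x_i$, and let $y_\lambda = (1 - \lambda)y_0 + \lambda y_1$. It is clear that $y_0 \le y_1$ and $y_\lambda \le x_\lambda$. Furthermore,

$$\begin{aligned}
\phi(x_\lambda) &\le Q + f(y_\lambda) \\
&\le (1 - \lambda)(Q + f(y_0)) + \lambda(Q + f(y_1)) + \max\{\lambda, 1 - \lambda\} K \\
&= (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \max\{\lambda, 1 - \lambda\} K,
\end{aligned}$$

where the second inequality follows from the symmetric $K$-convexity of the function $f(x)$.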

Case 3: $x_0 \in R$, $x_1 \in E$. Let $\phi(x_0) = Q + f(y_0)$ with $y_0 \le x_0$. We will distinguish between two cases.

Subcase 1: $f(y_0) - f(x_1) \le K - Q$. In this case,

$$\begin{aligned}
\phi(x_\lambda) &\le Q + f(y_0) \\
&= (1 - \lambda)(Q + f(y_0)) + \lambda f(x_1) + \lambda(Q + f(y_0) - f(x_1)) \\
&\le (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \lambda K.
\end{aligned}$$

Subcase 2: $f(y_0) - f(x_1) > K - Q$. Let $x_\lambda = (1 - \mu)y_0 + \mu x_1$ with $\lambda \le \mu$. Then

$$\begin{aligned}
\phi(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \mu) f(y_0) + \mu f(x_1) + \max\{\mu, 1 - \mu\} K \\
&= (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \max\{\mu, 1 - \mu\} K + (\mu - \lambda)(f(x_1) - f(y_0)) - (1 - \lambda) Q \\
&\le (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \max\{\mu, 1 - \mu\} K - (1 - \mu) Q - (\mu - \lambda) K \\
&\le (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \max\{\lambda, 1 - \lambda\} K,
\end{aligned}$$

where the second inequality follows from the symmetric $K$-convexity of the function $f(x)$, and the third inequality follows from the assumption that $f(y_0) - f(x_1) > K - Q$.

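Case 4: $x_0 \in E$, $x_1 \in R$. Let $\phi(x_1) = Q + f(y_1)$ for $y_1 \le x_1$. Again, we distinguish between two different cases.

Subcase 1: $y_1 \le x_\lambda$. In this case,

$$\begin{aligned}
\phi(x_\lambda) &\le Q + f(y_1) \\
&= (1 - \lambda) f(x_0) + \lambda(Q + f(y_1)) + (1 - \lambda)(Q + f(y_1) - f(x_0)) \\
&\le (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + (1 - \lambda) Q,
\end{aligned}$$

where the last inequality holds since $f(y_1) \le f(x_0)$.

Subcase 2: $y_1 > x_\lambda$. Let $x_\lambda = (1 - \mu)x_0 + \mu y_1$ with $\lambda \le \mu$. Then

$$\begin{aligned}
\phi(x_\lambda) &\le f(x_\lambda) \\
&\le (1 - \mu) f(x_0) + \mu f(y_1) + \max\{\mu, 1 - \mu\} K \\
&= (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \max\{\mu, 1 - \mu\} K + (\mu - \lambda)(f(y_1) - f(x_0)) - \lambda Q,
\end{aligned} \tag{10.15}$$

where the second inequality follows from the symmetric $K$-convexity of the function $f(x)$. On the other hand, since $x_0 \le x_\lambda$,

$$\begin{aligned}
\phi(x_\lambda) &\le Q + f(x_0) \\
&= (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \lambda(f(x_0) - f(y_1)) + (1 - \lambda) Q.
\end{aligned} \tag{10.16}$$

If $\mu \le \frac{1}{2}$, then inequality (10.15) implies inequality (10.14) since

$$\max\{\mu, 1 - \mu\} = 1 - \mu \le 1 - \lambda \quad \text{and} \quad f(y_1) \le f(x_0).$$

Now assume that $\mu > \frac{1}{2}$. Multiplying (10.15) by $\lambda/\mu$ and (10.16) by $(\mu - \lambda)/\mu$ and adding them together, we have

$$\phi(x_\lambda) \le (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \lambda K - \Bigl(\frac{\lambda}{\mu} - (1 - \lambda)\Bigr) Q. \tag{10.17}$$

If $\lambda \ge \frac{1}{2}$, then $\frac{\lambda}{\mu} - (1 - \lambda) \ge 0$, which, together with inequality (10.17), implies (10.14). On the other hand, if $\lambda < \frac{1}{2}$, we have that

$$\lambda K - \Bigl(\frac{\lambda}{\mu} - (1 - \lambda)\Bigr) Q = (1 - \lambda) K + (2\lambda - 1)(K - Q) + \lambda Q \Bigl(1 - \frac{1}{\mu}\Bigr) \le (1 - \lambda) K,$$

which, together with inequality (10.17), implies (10.14). ■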

In the following, we show that, like convexity, symmetric $K$-convexity is preserved under optimization operations.

Lemma 10.4.5 Let $f(\cdot, \cdot) : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ be symmetric $K$-convex. Assume that for a given $x \in \mathbb{R}^n$, there is an associated set $C(x) \subset \mathbb{R}^m$ and

$$C := \{(x, y) \mid y \in C(x),\ x \in \mathbb{R}^n\}$$

is convex. Furthermore, assume that

$$\phi(x) = \min_{y \in C(x)} f(x, y)$$

is well defined and the minimization is attainable for any $x$. Then $\phi$ is symmetric $K$-convex.

Proof. For any $x_0, x_1 \in \mathbb{R}^n$ and $\lambda \in [0, 1]$, let $y_0 \in C(x_0)$ and $y_1 \in C(x_1)$ be such that $\phi(x_0) = f(x_0, y_0)$ and $\phi(x_1) = f(x_1, y_1)$. Then, by the convexity of $C$,

$$(1 - \lambda)y_0 + \lambda y_1 \in C((1 - \lambda)x_0 + \lambda x_1)$$

and

$$\begin{aligned}
\phi((1 - \lambda)x_0 + \lambda x_1) &\le f((1 - \lambda)x_0 + \lambda x_1, (1 - \lambda)y_0 + \lambda y_1) \\
&\le (1 - \lambda) f(x_0, y_0) + \lambda f(x_1, y_1) + \max\{\lambda, 1 - \lambda\} K \\
&= (1 - \lambda)\phi(x_0) + \lambda\phi(x_1) + \max\{\lambda, 1 - \lambda\} K.
\end{aligned}$$

Therefore, $\phi$ is symmetric $K$-convex. ■

In the following, we focus on characterizing the optimal solution for the finite- 
horizon model. Specifically, our objective is to identify pricing and replenishment 
policies that solve (10.7) or its equivalent (10.9). 

It turns out that for the additive demand model, the optimal policy and the analysis are significantly different from the optimal policy for the general demand case. In fact, in this case, the symmetric $K$-convexity is not needed. Specifically, we show, in Sect. 10.4.3, that when the demand function is additive, the function $\phi_t$ is $K$-concave for any $t$, and hence an $(s, S, p)$ policy is optimal for problem (10.9). Formally, in this policy, every period $t$, the inventory policy is characterized by two parameters, the reorder point, $s_t$, and the order-up-to level, $S_t$. An order of size $S_t - x_t$ is made at the beginning of period $t$ if the initial inventory level at the beginning of the period, $x_t$, is smaller than $s_t$. Otherwise, no order is placed. The selling price in period $t$, $p_t$, is a function of the inventory level after an order was made.
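The decision rule of an $(s, S, p)$ policy in a single period can be written in a few lines. In the sketch below, `price_of` is a hypothetical callable that maps the inventory level after ordering to a selling price; the thresholds and the pricing rule in the example are arbitrary illustrative choices of ours.

```python
def s_S_p_decision(x, s, S, price_of):
    """Return (order quantity, selling price) for initial inventory x.

    Orders up to S when x < s and orders nothing otherwise; the price
    depends on the inventory level after ordering.
    """
    y = S if x < s else x
    return y - x, price_of(y)


# Hypothetical pricing rule: the price decreases in the post-order inventory.
price_of = lambda y: 30.0 - y / 5.0

for x in (4, 10, 25):
    print(x, s_S_p_decision(x, s=10, S=50, price_of=price_of))
```

Every initial inventory level below $s_t$ leads to the same post-order inventory $S_t$ and therefore to the same price.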

For more general demand functions, that is, multiplicative plus additive functions, the function $\phi_t$ is not necessarily $K$-concave and an $(s, S, p)$ policy is not necessarily optimal. Indeed, in this case, we show, in Sect. 10.4.4, that $\phi_t$ is symmetric $K$-concave, which allows us to characterize the optimal policy for the general demand model. Finally, in Sect. 10.4.5, we show that our results imply that in the special case with zero fixed cost and general demand functions, a base stock list price policy is optimal.

10.4.3 Additive Demand Functions

In the additive demand model, the demand function is assumed to be of the form

$$d_t = D_t(p_t) + \beta_t,$$

where $\beta_t$ is a random variable.

Observe that a special case of this demand function is the additive linear demand function in which $d_t = b_t - a_t p_t + \beta_t$, with $b_t, a_t > 0$ for $t = 1, 2, \ldots, T$.

In the following, we show, by induction, that $f_t(y)$ is a $K$-concave function of $y$ and $\phi_t(x)$ is a $K$-concave function of $x$. Therefore, the optimality of an $(s, S, p)$ policy follows directly from Lemma 9.3.2.

To prove that $f_t(y)$ is a $K$-concave function of $y$, we need the following lemma.

Lemma 10.4.6 Assume that $r$ is concave on a bounded interval $[\underline{d}, \bar{d}]$ and $w : \mathbb{R} \to \mathbb{R}$ is continuous. Then there exists an optimal solution $d(x)$ of the optimization problem

$$f(x) = \max_{d \in [\underline{d}, \bar{d}]} r(d) + w(x - d) \tag{10.18}$$

such that $x - d(x)$ is nondecreasing in $x$. If $w$ is $K$-concave, then $f$ is also $K$-concave.

Proof. Define a new variable $\tilde{d} = x - d$. We have that

$$f(x) = \max_{x - \tilde{d} \in [\underline{d}, \bar{d}]} r(x - \tilde{d}) + w(\tilde{d}).$$

Our assumption, together with Theorem 2.2.6, implies that the objective function of the above optimization problem is supermodular in $(x, \tilde{d})$. It is also easy to verify that the constraint set is a lattice. Thus, from Theorem 2.2.8, there exists an optimal solution $\tilde{d}(x)$ that is nondecreasing in $x$. The first part of the lemma follows since $d(x) = x - \tilde{d}(x)$ is optimal to problem (10.18).

To prove the second part of the lemma, take any $x, x'$ with $x < x'$ and $\lambda \in [0, 1]$. Since $w$ is $K$-concave and $\tilde{d}(x)$ is nondecreasing, we have that

$$\begin{aligned}
f((1 - \lambda)x + \lambda x') &\ge r((1 - \lambda) d(x) + \lambda d(x')) + w((1 - \lambda)\tilde{d}(x) + \lambda \tilde{d}(x')) \\
&\ge (1 - \lambda) r(d(x)) + \lambda r(d(x')) + (1 - \lambda) w(\tilde{d}(x)) + \lambda w(\tilde{d}(x')) - \lambda K \\
&= (1 - \lambda) f(x) + \lambda f(x') - \lambda K,
\end{aligned}$$

where the first inequality and the equality follow from the definition of $d(\cdot)$ and $\tilde{d}(\cdot)$, and the second inequality from the concavity of $r(d)$ and $K$-concavity of $w(x)$ as well as the monotonicity of $\tilde{d}(x)$. Thus, $f$ is $K$-concave. ■

We are now ready to prove our main results for the additive demand model. 

Theorem 10.4.7 (a) For any $t = 1, 2, \ldots, T$, $g_t(y, d)$ is jointly continuous in $(y, d)$, and hence for any fixed $y$, $g_t(y, d)$ has a finite maximizer $d_t(y)$ such that $y - d_t(y)$ is nondecreasing in $y$. Furthermore,

$$\lim_{|y| \to \infty} g_t(y, d) = -\infty \text{ for any } d \in [\underline{d}_t, \bar{d}_t] \text{ uniformly.}$$

(b) For any $t = 1, 2, \ldots, T$, $f_t(y)$ and $\phi_t(x)$ are $K$-concave.

(c) For any $t = 1, 2, \ldots, T$, there exist $s_t$ and $S_t$ with $s_t \le S_t$ such that it is optimal to order $S_t - x_t$ and set the selling price $p_t(x_t) = D_t^{-1}(d_t(S_t))$ when the initial inventory level $x_t < s_t$, and not to order anything and set $p_t(x_t) = D_t^{-1}(d_t(x_t))$ when $x_t \ge s_t$.



Proof. We prove by induction. For period $T$, part (a) follows directly from Assumption 10.4.1. Parts (b) and (c) hold since $f_T(y)$ is concave.

Assume that parts (a), (b), and (c) hold for $t + 1$. Since $g_{t+1}(y, d)$ is continuous, $f_{t+1}(y) = \max_{d \in [\underline{d}_{t+1}, \bar{d}_{t+1}]} g_{t+1}(y, d)$, and hence, $\phi_{t+1}(x) = \max\{f_{t+1}(x), -K + \max_{y \ge x} f_{t+1}(y)\}$ are also continuous, which implies that $g_t(y, d)$ is continuous in $(y, d)$ as well. Note that $g_t(y, d)$ can be expressed as the sum of a concave function of $d$ and a $K$-concave function of $y - d$ since $\phi_{t+1}(x)$ is $K$-concave. Therefore, from Lemma 10.4.6, $f_t(y)$ is $K$-concave, and from Proposition 9.3.3, $\phi_t$ is $K$-concave as well. Thus, part (b) holds for period $t$.

In addition, we have again by Lemma 10.4.6 that for any fixed $y$, $g_t(y, d)$ has a finite maximizer $d_t(y)$ such that $y - d_t(y)$ is nondecreasing in $y$. The optimality of the $(s_{t+1}, S_{t+1})$ policy implies that

$$E[\phi_{t+1}(y - d - \beta_t)] \le \phi_{t+1}(S_{t+1})$$

for any $(y, d)$, and hence, $\lim_{|y| \to \infty} g_t(y, d) = -\infty$ for any $d \in [\underline{d}_t, \bar{d}_t]$ uniformly by Assumption 10.4.1. Therefore, part (a) holds for period $t$.

We now prove part (c). Since $f_t(y)$ is $K$-concave, Lemma 9.3.2 part (d) implies that there exist $s_t$ and $S_t$ such that $S_t$ maximizes $f_t(y)$ and $s_t$ is the smallest value of $y$ such that $f_t(S_t) = f_t(y) + K$, and

$$\phi_t(x) = \begin{cases} -K + f_t(S_t) & \text{if } x \le s_t, \\ f_t(x) & \text{if } x > s_t. \end{cases}$$

Hence, part (c) holds.

Finally, the optimality of the price function $p_t(x_t)$ follows from the definition of $d_t(y)$. ■

Thus, Theorem 10.4.7 implies that an $(s, S, p)$ policy is optimal when the demand is additive. In addition, there exists an optimal solution $d_t(y)$ maximizing (10.10) such that $y - d_t(y)$ is a nondecreasing function of $y$. That is, the higher the inventory level $y_t$ at the beginning of period $t$ after ordering, the higher the expected inventory level at the end of period $t$, $y_t - d_t(y_t)$.

An interesting question is whether a list price policy is optimal, as is the case for the single-period model with no fixed cost. Unfortunately, this property does not hold for the finite-horizon model, as illustrated by Chen and Simchi-Levi (2004a). Indeed, although there is incentive to lower the selling price to reduce inventory, there is also incentive to raise the price in order to slow the depletion of inventory and delay the incurring of fixed ordering cost.

10.4.4 General Demand Functions

In this section, we focus on the model with general demand functions (10.1). Observe that the additive demand function analyzed in the previous section is a special case of the general demand function (10.1). More importantly, multiplicative demand functions of the form $d_t = \alpha_t D_t(p)$, where $D_t(p) = a_t p^{-b_t}$ ($a_t > 0$, $b_t > 1$), or demand functions of the form $d_t = \beta_t + \alpha_t(b_t - a_t p)$ ($a_t > 0$, $b_t > 0$) are also special cases.

To characterize the optimal policy for the model with the demand functions (10.1), one might consider using the same approach applied in Sect. 10.4.3. Unfortunately, in this case, the function $y - \alpha_t d_t(y)$ is not necessarily a nondecreasing function of $y$ for all possible $\alpha_t$, as is the case for additive demand functions. Hence, the approach employed in Sect. 10.4.3 does not work in this case. In fact, as demonstrated in Chen and Simchi-Levi (2004a), the functions $g_t(y, d_t(y))$ and $\phi_t(x)$ are in general not $K$-concave, and an $(s, S, p)$ policy is not necessarily optimal.

To overcome these difficulties, we apply the concept of symmetric $K$-convexity introduced in Sect. 10.4.2. Specifically, in the following, we show, by induction, that $g_t(y, d)$ is a symmetric $K$-concave function of $(y, d)$ and $\phi_t(x)$ is a symmetric $K$-concave function of $x$. Hence, a characterization of the optimal pricing and ordering policies follows from Lemma 10.4.3.

Theorem 10.4.8 (a) For any $t$, $g_t(y, d)$ is continuous in $(y, d)$, and hence for any fixed $y$, $g_t(y, d)$ has a finite maximizer $d_t(y)$. Furthermore,

$$\lim_{|y| \to \infty} g_t(y, d) = -\infty \text{ for any } d \in [\underline{d}_t, \bar{d}_t] \text{ uniformly.}$$

(b) For any $t = 1, 2, \ldots, T$, $g_t(y, d)$ and $\phi_t(x)$ are symmetric $K$-concave.

(c) For any $t = 1, 2, \ldots, T$, there exist $s_t$ and $S_t$ with $s_t \le S_t$ and a set $A_t \subset [s_t, (s_t + S_t)/2]$ such that it is optimal to order $S_t - x_t$ and set the selling price $p_t = D_t^{-1}(d_t(S_t))$ when the initial inventory level $x_t < s_t$ or $x_t \in A_t$, and not to order anything and set $p_t = D_t^{-1}(d_t(x_t))$ otherwise.

Proof. The proof of part (a) is similar to that of part (a) in Theorem 10.4.7. We 
now focus on part (b). 

By induction, $\phi_{T+1}(x) = 0$ is symmetric 0-concave. From the symmetric 
K-concavity of $\phi_{t+1}(x)$, we have that $E[\phi_{t+1}(y - \alpha_t d - \beta_t)]$ 
is symmetric K-concave. Also, we have that $H_t^{\gamma}(y, d)$ is concave by 
Assumption 10.4.1. Hence, $g_t(y, d)$ is symmetric $\gamma K$-concave, and hence by 
Lemma 10.4.5, the function $f_t(y)$ is symmetric $\gamma K$-concave. Finally, $f_t(y)$ 
is symmetric K-concave by Lemma 10.4.3 part (a), and the symmetric K-concavity of 
$\phi_t(x)$ follows from Proposition 10.4.4. Thus, part (b) holds. 

We now prove part (c). From Lemma 10.4.3 part (d), we have 

$$\phi_t(x) = \begin{cases} -K + f_t(S_t) & \text{if } x \in I_t, \\ f_t(x) & \text{if } x \notin I_t, \end{cases}$$

where $S_t$ is the maximizer of $f_t(y)$ and 

$$I_t = \{ y \le S_t \mid f_t(y) \le f_t(S_t) - K \}.$$

Furthermore, $\phi_t(x) \ge f_t(x)$ for any x and $\phi_t(x) \ge -K + f_t(S_t)$ for any 
$x \le S_t$. Let $s_t$ be defined as the smallest value of y such that 
$f_t(S_t) = f_t(y) + K$. Note that from Lemma 10.4.3 part (d), 
$(-\infty, s_t] \subset I_t$ and $[(s_t + S_t)/2, \infty) \subset (I_t)^c$, the 
complement of $I_t$. Part (c) follows from Lemma 10.4.3 and part (b) by defining 

$$A_t = I_t \cap [s_t, (s_t + S_t)/2].$$

Again, the optimality of the price function $p_t(x_t)$ follows from the definition of 
$d_t(y)$. ■ 

Theorem 10.4.8 thus implies that the optimal policy for problem (10.7) is an 
(s, S, A, p) policy. Such a policy is characterized by two parameters, $s_t$ and 
$S_t$, and a set $A_t \subset [s_t, (s_t + S_t)/2]$, possibly empty. When the inventory 
level $x_t$ at the beginning of period t is less than $s_t$ or $x_t$ is in the set 
$A_t$, an order of size $S_t - x_t$ is made. Otherwise, no order is placed. Thus, it is 
possible that an order will be placed when the inventory level 
$x_t \in [s_t, (s_t + S_t)/2]$, depending on the problem instance. In any case, if an 
order is placed, it is always to raise the inventory level to $S_t$. 
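
The ordering part of the rule just described can be sketched in a few lines. This is only an illustration: the function name order_up_to, the membership test in_A, and the numerical values are hypothetical, and the pricing decision is left out.

```python
from typing import Callable


def order_up_to(x: float, s: float, S: float,
                in_A: Callable[[float], bool]) -> float:
    """Return the post-order inventory level under an (s, S, A, p) rule.

    x    : inventory level at the beginning of the period
    s, S : reorder point and order-up-to level, with s <= S
    in_A : membership test for the set A, assumed to lie in [s, (s + S) / 2]
    """
    if x < s or in_A(x):
        return S      # place an order of size S - x
    return x          # do not order


# Hypothetical instance: s = 10, S = 50, A = [12, 15].
in_A = lambda x: 12.0 <= x <= 15.0
levels = [order_up_to(x, 10.0, 50.0, in_A) for x in (5.0, 11.0, 13.0, 40.0)]
```

In this instance an order is placed at x = 13 even though 13 exceeds s, because 13 lies in A, while no order is placed at x = 11.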

10.4.5 Special Case: Zero Fixed Ordering Cost 

We now apply our results to the zero-fixed-cost case. 

Corollary 10.4.9 Consider our model with zero fixed ordering cost and general 
demand functions (10.1). In this case, a base stock list price policy is optimal. 

Proof. By Theorem 10.4.8, the functions $\phi_t(x)$ and $f_t(y)$, t = 1, 2, ..., T, are 
symmetric 0-concave and hence, from Definition 10.4.2, concave. The optimality 
of the base stock inventory policy follows directly from the concavity of $f_t(y)$ for 
t = 1, 2, ..., T. 

We now show that $d_t(y)$ is nondecreasing, and therefore the optimal price $p_t(y)$ is 
nonincreasing. In fact, in the zero fixed ordering cost case, $g_t(y, d)$ can be 
expressed as $r(d) + E[w(y - \alpha d)]$ for some concave function w, and thus 
Theorem 2.2.6 implies that $g_t(y, d)$ is supermodular. From Theorem 2.2.8, there 
exists $d_t(y)$, which is nondecreasing. ■ 



10.5 Alternative Approach to the Optimality of (s, S, p) Policies 

The analysis in the previous section builds upon the concept of K-convexity and 
its extension, symmetric K-convexity. In this section, we provide an alternative 
approach developed by Huh and Janakiraman (2008) to prove the optimality of 
(s, S, p) policies for the model in Sect. 10.4. The approach is essentially an 
extension of the one in Sect. 9.4 for stochastic inventory models to integrated 
inventory and pricing models. Similar to Sect. 9.4, we focus on stationary systems. 
That is, the inventory holding and backorder cost function $h_t$ and parameters 
$c_t$, $\underline{d}_t$, and $\bar{d}_t$ are all time-independent, and the random 
variables $(\alpha_t, \beta_t)$ (t = 1, 2, ..., T) are iid across time. Thus, we drop 
the subscript t of $\underline{d}_t$, $\bar{d}_t$, and $H_t^{\gamma}$ in the dynamic 
program (10.9)-(10.10) in this section. 

We assume that $H^{\gamma}(y, d)$ is continuous in (y, d) and 
$\lim_{|x| \to \infty} Q^{\gamma}(x) = -\infty$, where 
$Q^{\gamma}(x) = \max_{d \in [\underline{d}, \bar{d}]} H^{\gamma}(x, d)$. With these 
assumptions, the functions $\phi_t$ and $f_t$ are continuous, and 
$\lim_{|x| \to \infty} \phi_t(x) = \lim_{|x| \to \infty} f_t(x) = -\infty$. Thus, the 
related optimal solutions exist in the dynamic program (10.9)-(10.10). Let $y^0$ be a 
global maximizer of $Q^{\gamma}(x)$. We make the following assumption. 

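Assumption 10.5.1 (a) For any $y^1$ and $y^2$ with $y^2 \le y^1 \le y^0$ and 
$d^2 \in [\underline{d}, \bar{d}]$, there exists a $d^1 \in [\underline{d}, \bar{d}]$ 
such that 

$$H^{\gamma}(y^1, d^1) \ge H^{\gamma}(y^2, d^2) \quad \text{and} \quad y^1 - \alpha d^1 - \beta \ge y^2 - \alpha d^2 - \beta$$

for any realization of $(\alpha, \beta)$. 

(b) For any $y^1$ and $y^2$ with $y^0 \le y^1 \le y^2$ and 
$d^2 \in [\underline{d}, \bar{d}]$, there exists a $d^1 \in [\underline{d}, \bar{d}]$ 
such that 

$$H^{\gamma}(y^1, d^1) \ge H^{\gamma}(y^2, d^2) \quad \text{and} \quad y^1 - \alpha d^1 - \beta \le \max\{y^2 - \alpha d^2 - \beta, \, y^0\}$$

for any realization of $(\alpha, \beta)$. 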

Observe that given (y, d) and the realization of $(\alpha, \beta)$ at a period, 
$y - \alpha d - \beta$ represents the initial inventory level of the next period. 
Assumption 10.5.1 indicates that for any $(y^2, d^2)$, if $y^1$ is closer than $y^2$ 
to $y^0$ (the inventory level that attains the highest single-period expected 
profit), we can always find an expected demand level $d^1$ (correspondingly a selling 
price) such that $(y^1, d^1)$ incurs a higher single-period expected profit than 
$(y^2, d^2)$. This implies that $Q^{\gamma}$ is quasiconcave. In addition, the initial 
inventory level of the next period resulting from $(y^1, d^1)$ is closer to $y^0$ than 
that from $(y^2, d^2)$, or one can raise the inventory level 
$y^1 - \alpha d^1 - \beta$ to $y^0$ by placing an order in the case with 
$y^0 \le y^1 \le y^2$. The approach in this section is applicable for demand models 
more general than the linear one presented here, for which we refer to Huh and 
Janakiraman (2008). 

We now present conditions under which Assumption 10.5.1 is valid. Recall the 
definitions of $h^{\gamma}$ and R in (10.8). 

Proposition 10.5.2 Assume additive demand, namely, $\alpha = 1$. If 
$E[h^{\gamma}(x - \beta)]$ is quasiconvex and R(d) is quasiconcave, then 
Assumption 10.5.1 holds. 

Proof. Let $d^0$ be a global maximizer of R over $[\underline{d}, \bar{d}]$ and let 
$x^0$ be a global minimizer of $E[h^{\gamma}(x - \beta)]$. Since 
$H^{\gamma}(y, d) = R(d) - E[h^{\gamma}(y - d - \beta)]$, $y^0 = d^0 + x^0$ is a 
global maximizer of $Q^{\gamma}$. 

First, consider the case with $y^2 \le y^1 \le y^0$. For any fixed 
$d^2 \in [\underline{d}, \bar{d}]$, define 

$$d^1 = \min\{(y^1 - y^2) + d^2, \, d^0\}.$$

If $d^1 = (y^1 - y^2) + d^2$, we have that 

$$y^1 - d^1 - \beta = y^2 - d^2 - \beta, \qquad d^2 \le d^1 \le d^0,$$

and 

$$H^{\gamma}(y^1, d^1) = R(d^1) - E[h^{\gamma}(y^1 - d^1 - \beta)] \ge R(d^2) - E[h^{\gamma}(y^2 - d^2 - \beta)] = H^{\gamma}(y^2, d^2),$$

where the inequality follows from the quasiconcavity of R. 

If $d^1 = d^0$, we have that $d^0 \le (y^1 - y^2) + d^2$. Hence, 

$$y^0 - d^0 - \beta \ge y^1 - d^1 - \beta \ge y^2 - d^2 - \beta,$$

which implies that 

$$H^{\gamma}(y^1, d^1) = R(d^1) - E[h^{\gamma}(y^1 - d^1 - \beta)] \ge R(d^2) - E[h^{\gamma}(y^2 - d^2 - \beta)] = H^{\gamma}(y^2, d^2),$$

where the inequality follows from the definition of $d^0$ and the quasiconvexity of 
$E[h^{\gamma}(x - \beta)]$. Thus, Assumption 10.5.1 part (a) holds. 

For the case with $y^0 \le y^1 \le y^2$, define 

$$d^1 = \max\{(y^1 - y^2) + d^2, \, d^0\}.$$

Assumption 10.5.1 part (b) follows from a similar argument. ■ 
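
The construction of $d^1$ in part (a) can be checked numerically on a small instance. The sketch below is illustrative only: the revenue function R, the cost function h, the three-point distribution of $\beta$, and the demand bounds are all hypothetical choices that satisfy the hypotheses of the proposition (for this instance $d^0 = 5$, $x^0 = 1$, and $y^0 = 6$).

```python
import random

D_LO, D_HI = 0.0, 8.0        # hypothetical bounds on the expected demand d
BETAS = (-1.0, 0.0, 1.0)     # hypothetical, equally likely values of beta


def R(d):
    """Quasiconcave revenue on [D_LO, D_HI], maximized at d0 = 5."""
    return d * (10.0 - d)


def h(x):
    """Convex holding (x > 0) and backorder (x < 0) cost."""
    return 2.0 * max(x, 0.0) + 5.0 * max(-x, 0.0)


def expected_h(x):
    """E[h(x - beta)]; quasiconvex and minimized at x0 = 1 for this instance."""
    return sum(h(x - b) for b in BETAS) / len(BETAS)


def H(y, d):
    """Single-period expected profit under additive demand (alpha = 1)."""
    return R(d) - expected_h(y - d)


D0, X0 = 5.0, 1.0
Y0 = D0 + X0


def check_part_a(trials=10_000, seed=0, tol=1e-9):
    """Test d1 = min{(y1 - y2) + d2, d0} on random pairs with y2 <= y1 <= y0."""
    rng = random.Random(seed)
    for _ in range(trials):
        y2, y1 = sorted(rng.uniform(-10.0, Y0) for _ in range(2))
        d2 = rng.uniform(D_LO, D_HI)
        d1 = min((y1 - y2) + d2, D0)
        if not (D_LO <= d1 <= D_HI):
            return False
        if H(y1, d1) < H(y2, d2) - tol:      # profit condition
            return False
        if y1 - d1 < y2 - d2 - tol:          # next-period inventory condition
            return False                     # (beta cancels when alpha = 1)
    return True
```

A call to check_part_a() returning True is consistent with part (a) on this instance; it is a sanity check, not a proof.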

Lemma 10.5.3 Under Assumption 10.5.1, for any $y_t^2 \le y_t^1 \le y^0$, 

$$f_t(y_t^1) \ge f_t(y_t^2);$$

for $y^0 \le y_t^1 \le y_t^2$, 

$$f_t(y_t^1) \ge f_t(y_t^2) - \gamma K.$$

Proof. At period t, we compare two systems with initial inventory levels $y_t^1$ and 
$y_t^2$, referred to as systems $y^1$ and $y^2$, respectively. Assume that system 
$y^2$ follows an optimal strategy that attains $f_t(y_t^2)$, the expected total 
discounted profit for the remaining T - t + 1 periods if we act optimally in those 
periods except that no order is placed at period t. For a given sample path of the 
system, namely, a realization of system uncertainties, denote by 
$(x_l^2, y_l^2, d_l^2)$ the inventory level before placing the order, the inventory 
level after receiving the order, and the expected demand level of period l of system 
$y^2$ following the optimal strategy, respectively. We will construct a feasible 
strategy for system $y^1$ and compare its expected total discounted profit with 
$f_t(y_t^2)$. Let $(x_l^1, y_l^1, d_l^1)$ be the inventory level before placing the 
order, the inventory level after receiving the order, and the expected demand level 
of period l of system $y^1$ along this sample path under this strategy, respectively. 
Note that $x_{l+1}^i = y_l^i - \alpha_l d_l^i - \beta_l$ for a realization 
$(\alpha_l, \beta_l)$ and i = 1, 2. 

We now describe the process of constructing a strategy for system $y^1$. For the 
case with $y_t^2 \le y_t^1 \le y^0$, according to Assumption 10.5.1 part (a), we can 
pick a $d_t^1 \in [\underline{d}, \bar{d}]$ such that 

$$H^{\gamma}(y_t^1, d_t^1) \ge H^{\gamma}(y_t^2, d_t^2) \quad \text{and} \quad x_{t+1}^2 \le x_{t+1}^1 \le y^0. \qquad (10.19)$$

For the case with $y^0 \le y_t^1 \le y_t^2$, according to Assumption 10.5.1 part (b), 
we can select a $d_t^1 \in [\underline{d}, \bar{d}]$ such that 

$$H^{\gamma}(y_t^1, d_t^1) \ge H^{\gamma}(y_t^2, d_t^2) \quad \text{and} \quad x_{t+1}^1 \le \max\{x_{t+1}^2, \, y^0\}.$$

In either case, we end up with two possibilities: 

(1) $x_{t+1}^1 \le y_{t+1}^2$ or (2) $y_{t+1}^2 < x_{t+1}^1 \le y^0$. 

In the first case, for system $y^1$, place an order to raise the inventory level from 
$x_{t+1}^1$ to $y_{t+1}^2$ at period t + 1, use $d_{t+1}^2$, and follow the strategy 
of system $y^2$ thereafter. In the second case, for system $y^1$, order nothing at 
period t + 1. Thus, $y_{t+1}^1 = x_{t+1}^1$ and $y_{t+1}^2 < y_{t+1}^1 \le y^0$. We 
then repeat the process at period t + 1 and later periods if necessary until systems 
$y^1$ and $y^2$ end up with the same inventory level or we reach the end of the 
planning horizon. 

From the description of the process, for $y_t^2 \le y_t^1 \le y^0$, it is clear that 
at any period l with $l \ge t$, the realized profit of system $y^1$ is always no less 
than that of system $y^2$. Note that when $x_{t+1}^1 < y_{t+1}^2$, system $y^2$ must 
have placed an order at period t + 1. Since this is true along any sample path, we 
have that $f_t(y_t^1) \ge f_t(y_t^2)$. The case with $y^0 \le y_t^1 \le y_t^2$ is 
similar except that when $x_{t+1}^1 < y_{t+1}^2$, system $y^1$ needs to place an order 
at period t + 1, while system $y^2$ may not. Therefore, 
$f_t(y_t^1) \ge f_t(y_t^2) - \gamma K$. ■ 

The above lemma allows us to show the optimality of an (s, S, p) policy under 
Assumption 10.5.1. 

Theorem 10.5.4 Consider the finite-horizon model described in Sect. 10.4. If 
the system is stationary and Assumption 10.5.1 holds, then an (s, S, p) policy is 
optimal. 



Proof. Let $S_t$ be a global maximizer of $f_t$ with $S_t \ge y^0$. Its existence is 
guaranteed by our assumptions on $H^{\gamma}$ and Lemma 10.5.3. Let 
$s_t = \min\{x \mid f_t(x) = f_t(S_t) - K\}$. Since $f_t$ is continuous and 
$\lim_{|x| \to \infty} f_t(x) = -\infty$, $s_t$ is well defined. In addition, 
$s_t \le y^0$ since from Lemma 10.5.3, $f_t(y) \ge f_t(S_t) - \gamma K$ for any y with 
$S_t \ge y \ge y^0$. 

We show that the $(s_t, S_t)$ inventory policy is optimal. To see this, note that for 
any $y_t^1 \le s_t$, $f_t(y_t^1) \le f_t(s_t) = f_t(S_t) - K$. Thus, it is optimal to 
place an order to raise the inventory level to $S_t$. On the other hand, for any 
inventory level $y_t^1$ above $s_t$, it is optimal not to place an order. In fact, if 
$y_t^1 \in [s_t, y^0]$, 

$$f_t(y_t^1) \ge f_t(s_t) = f_t(S_t) - K,$$

and if $y_t^1 \ge y^0$, from Lemma 10.5.3, 

$$f_t(y_t^1) \ge f_t(y_t^2) - \gamma K \ge f_t(y_t^2) - K \quad \text{for all } y^0 \le y_t^1 \le y_t^2,$$

which implies that it is better not to place an order. ■ 
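
Given sampled values of $f_t$, the two parameters used in the proof can be approximated on a grid. The sketch below is a hypothetical illustration: the helper s_S_from_grid and the quadratic $f_t$ are assumptions, and the grid search only approximates the exact definitions of $s_t$ and $S_t$.

```python
def s_S_from_grid(xs, f_vals, K):
    """Grid approximation of (s_t, S_t) from sampled values of f_t.

    xs     : increasing grid of inventory levels
    f_vals : f_t evaluated at each point of xs
    K      : fixed ordering cost (K >= 0)
    """
    # S_t: a maximizer of f_t on the grid.
    i_S = max(range(len(xs)), key=lambda i: f_vals[i])
    target = f_vals[i_S] - K
    # s_t: smallest grid point where not ordering is at least as good as
    # paying K and jumping to S_t, that is, f_t(x) >= f_t(S_t) - K.
    i_s = next(i for i in range(i_S + 1) if f_vals[i] >= target)
    return xs[i_s], xs[i_S]


xs = list(range(0, 41))
f_vals = [-(x - 20) ** 2 / 10 for x in xs]   # hypothetical profit-to-go f_t
s, S = s_S_from_grid(xs, f_vals, K=10.0)
```

For this quadratic example the search gives s = 10 and S = 20, since $f_t(10) = f_t(20) - 10$.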

Interestingly, using a similar approach but under a less restrictive assumption, 
we can show that a stationary (s, S, p) policy is optimal for a corresponding 
infinite-horizon model with stationary parameters and general demand functions 
under the discounted profit criterion. 

Assumption 10.5.5 (a) For any $y^1$ and $y^2$ with $y^2 \le y^1 \le y^0$ and 
$d^2 \in [\underline{d}, \bar{d}]$, there exists $d^1 \in [\underline{d}, \bar{d}]$ 
such that 

$$H^{\gamma}(y^1, d^1) \ge H^{\gamma}(y^2, d^2).$$

(b) The same as Assumption 10.5.1 part (b). 

Again, the above assumption implies that $Q^{\gamma}$ is quasiconcave. 

Proposition 10.5.6 If $H^{\gamma}(y, d)$ is quasiconcave in (y, d), then 
Assumption 10.5.5 holds. 

Proof. Let $d^0$ be the global maximizer of $H^{\gamma}(y^0, d)$ for 
$d \in [\underline{d}, \bar{d}]$. For any $y^1$ and $y^2$ with $y^2 \le y^1 \le y^0$ or 
$y^0 \le y^1 \le y^2$ and $d^2$, let $\lambda \in [0, 1]$ such that 
$y^1 = (1 - \lambda) y^0 + \lambda y^2$. Define 

$$d^1 = (1 - \lambda) d^0 + \lambda d^2.$$

From the quasiconcavity of $H^{\gamma}$, we have that 

$$H^{\gamma}(y^1, d^1) \ge \min\{H^{\gamma}(y^0, d^0), \, H^{\gamma}(y^2, d^2)\} = H^{\gamma}(y^2, d^2).$$

For $y^0 \le y^1 \le y^2$ and any realization of $(\alpha, \beta)$, 

$$y^1 - \alpha d^1 - \beta = (1 - \lambda)(y^0 - \alpha d^0 - \beta) + \lambda (y^2 - \alpha d^2 - \beta)$$
$$\le \max\{y^0 - \alpha d^0 - \beta, \, y^2 - \alpha d^2 - \beta\}$$
$$\le \max\{y^0, \, y^2 - \alpha d^2 - \beta\}.$$

Thus, Assumption 10.5.5 holds. ■ 

When R is concave and $h^{\gamma}$ is convex, $H^{\gamma}$ is concave and thus 
quasiconcave. It would be interesting to identify other conditions under which 
$H^{\gamma}$ is quasiconcave or Assumption 10.5.5 holds. 

Lemma 10.5.7 Under Assumption 10.5.5, for any $y_t^2 \le y_t^1 \le y^0$ or 
$y^0 \le y_t^1 \le y_t^2$, 

$$f_t(y_t^1) \ge f_t(y_t^2) - \gamma K.$$

Proof. The proof is similar to that for Lemma 10.5.3. The only exception is that 
for the case with $y_t^2 \le y_t^1 \le y^0$, we may no longer claim 
$x_{t+1}^2 \le x_{t+1}^1$ in (10.19) under Assumption 10.5.5. In this case, we cannot 
exclude the possibility that system $y^1$ places an order at period t + 1 to raise 
the inventory level from $x_{t+1}^1$ to $y_{t+1}^2$ while system $y^2$ does not order 
at period t + 1. Thus, for $y_t^2 \le y_t^1 \le y^0$, 
$f_t(y_t^1) \ge f_t(y_t^2) - \gamma K$ instead of $f_t(y_t^1) \ge f_t(y_t^2)$. ■ 

Theorem 10.5.8 Consider the infinite-horizon counterpart of the model described 
in Sect. 10.4 with stationary parameters and $\gamma \in (0, 1)$. If Assumption 10.5.5 
holds, then a stationary (s, S, p) policy is optimal. 

Proof. For the infinite-horizon counterpart of the model described in Sect. 10.4 
with stationary parameters and $\gamma \in (0, 1)$, we can show that $f_t$ is well 
defined and continuous and $\lim_{|x| \to \infty} f_t(x) = -\infty$. In addition, it is 
independent of t, and thus we drop the subscript of $f_t$ in the proof. 

Let S be a global maximizer of f and 
$s = \max\{x \mid f(x) = f(S) - K, \; x \le \min\{S, y^0\}\}$. We show that it is 
optimal to follow the (s, S) policy. The definition of s implies that for any 
$x \in (s, \min\{S, y^0\}]$, it is optimal not to order. From Lemma 10.5.7, for any 
given x with $x > \min\{S, y^0\}$, if $S < x \le y^0$, 

$$f(x) \ge f(S) - \gamma K \ge f(S) - K;$$

and if $x > y^0$, then for any $y \ge x$, 

$$f(x) \ge f(y) - \gamma K \ge f(y) - K.$$

Thus, for any x > s, it is optimal not to order. 

It remains to prove that for any x < s, it is optimal to place an order to raise 
the inventory level to S. First, observe that at the inventory level s, the expected 
profit generated by the strategy of not ordering now but ordering up to S at the 
next period is given by $Q^{\gamma}(s) + \gamma (f(S) - K)$, which by definition is no 
more than $f(s) = f(S) - K$. Therefore, 

$$Q^{\gamma}(s) \le (1 - \gamma)(f(S) - K).$$

Assume that it is not optimal to order for some x < s. Start with the initial 
inventory level x at any period, without loss of generality, at the first period. 
Given an optimal strategy, let $\tau$ be the first time that an order is placed for a 
realization of the uncertainties. Clearly, $\tau$ is a stopping time and we have that 

$$x = x_1 \ge x_2 \ge \cdots \ge x_{\tau},$$

where $x_t$ denotes the inventory level at the beginning of period t. At period 
$t < \tau$, no order is placed and the profit is no more than $Q^{\gamma}(x_t)$. Thus, 
for the realization of the uncertainties, the total discounted profit from periods 1 
to $\tau - 1$ is no more than 

$$\sum_{t < \tau} \gamma^{t-1} Q^{\gamma}(x_t).$$

Since $Q^{\gamma}$ is quasiconcave, $Q^{\gamma}(x_t) \le Q^{\gamma}(s)$ for $t < \tau$ 
and the above profit is no more than 

$$\sum_{t < \tau} \gamma^{t-1} Q^{\gamma}(s) \le (f(S) - K)(1 - \gamma^{\tau - 1}).$$

At period $\tau$, it is optimal to raise the inventory level to S, since S is a global 
maximizer of f and x < S. Conditioning on $\tau$, the net present value of the total 
expected discounted profit from periods $\tau$ to $T = \infty$ is 
$\gamma^{\tau - 1}(f(S) - K)$. Therefore, starting with x, the total expected 
discounted profit over the infinite planning horizon is no more than f(S) - K. 
However, this profit can be obtained by placing an order at period 1 to raise the 
inventory level to S. Thus, for any x < s, it is optimal to place an order to raise 
the inventory level to S. ■ 



10.6 Extensions and Challenges 

In Sect. 10.4, we show by employing the classic concept of K-convexity that an 
(s, S, p) policy is optimal for the additive demand case. By using a weaker concept 
of symmetric K-convexity, we show that an (s, S, A, p) policy is optimal for the 
general demand case. Theorem 10.5.8 in Sect. 10.5 shows that a stationary (s, S, p) 
policy is optimal for the infinite-horizon counterpart with stationary parameters 
and general demand functions under the discounted profit criterion. Based on 
the concept of symmetric K-concavity, Chen and Simchi-Levi (2004b) provide a 
unified proof for the optimality of a stationary (s, S, p) policy for the 
infinite-horizon model under either the discounted profit or the average profit 
criterion. Table 10.1 is a summary of structural results of the inventory (and 
pricing) models. 



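TABLE 10.1. Summary of results for the inventory (and pricing) problems 

                     | Inventory model   | Joint inventory and pricing model
---------------------+-------------------+----------------------------------------------------------
No fixed cost        | Base stock policy | Base stock list price policy
---------------------+-------------------+----------------------------------------------------------
Fixed ordering cost  | (s, S) policy     | Finite-horizon case, additive demand: (s, S, p) policy
                     |                   | Finite-horizon case, general demand: (s, S, A, p) policy
                     |                   | Infinite-horizon case: (s, S, p) policy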

Of course, it is appropriate to point out that many of our results in this chapter 
may not hold for problems with discrete prices (see Chen 2003). Indeed, if 
price is restricted to take values from a discrete set, even the single-period profit 
function may not be concave, and our analysis no longer works. This fact imposes 
a significant challenge for solving the integrated inventory and pricing models, 
since in order to solve these models, one usually discretizes inventory levels and 
prices. Thus, a natural question is whether one can design efficient algorithms 
by employing the structural results of optimal policy identified in previous 
subsections. 

Another challenge for the integrated inventory and pricing models analyzed in 
this section is the zero-lead-time assumption. Lead time poses no such difficulty 
for the standard inventory control problems with backlogging. In fact, for the 
standard stochastic inventory models, the structural results of the optimal policies 
can be generally extended to models with deterministic lead time, as we pointed out 
at the beginning of Sect. 9.6. The idea is to transform a model with a positive lead 
time to one with a similar structure but with zero lead time. However, this 
technique is not valid here, since for our models with a positive lead time, the two 
decisions (the ordering decision and the pricing decision) will take effect at 
different times. Yet, as demonstrated by Pang et al. (2012), the concept of 
L♮-convexity can still be useful, and the results in Sect. 9.6 can be extended to 
the integrated inventory and pricing models with positive lead times and 
backlogging. Specifically, they show that the optimal ordering quantity decreases in 
the on-hand inventory and the inventory in transit and is more sensitive to newer 
outstanding orders than older outstanding orders and the on-hand inventory with 
bounded sensitivities. The optimal price decreases in the on-hand inventory and the 
inventory in transit as well. However, it is more sensitive to older outstanding 
orders and the on-hand inventory than newer outstanding orders. More recently, 
Chen et al. (2012a) extend the models and results to perishable products with finite 
lifetimes. 

In this chapter, we restrict our effort to backlogging models. Lost sales models 
are much more complicated to deal with. In this case, even in the single-period 
model, the expected revenue $p E[\min(x, D(p, \epsilon))]$ as a function of p and x 
may not be well behaved. Many papers in the literature focus on the existence and 
uniqueness of the optimal solutions, the concavity or quasiconcavity of the 
expected profit functions, and comparative statics analysis. We refer to Chen and 
Simchi-Levi (2012) for references. 

Finally, a few recent papers analyze integrated inventory and pricing models with 
price adjustment cost (Chen, Zhou and Chen 2011 and Chen and Hu 2012) and 
models in which demand depends on not only the current price but also previous 
prices (Chen et al. 2012c). Again, we refer to Chen and Simchi-Levi (2012) for a 
survey of these new developments. 



10.7 Risk-Averse Inventory Models 

All the inventory (and pricing) models discussed so far focus on risk-neutral 
decision makers, that is, inventory managers who are insensitive to profit 
variations. Evidently, not all inventory managers are risk-neutral; many planners 
are willing to trade off lower expected profit for downside protection against 
possible losses. Indeed, experimental evidence suggests that for some products, the 
so-called high-profit products, decision makers are risk-averse; see Schweitzer and 
Cachon (2000) for more details. Unfortunately, traditional inventory control models 
fall short of meeting the needs of risk-averse planners. For instance, traditional 
inventory models do not suggest mechanisms to reduce the chance of unfavorable 
profit levels. Thus, it is important to incorporate the notions of risk aversion in 
a broad class of inventory models. 

The literature on risk-averse inventory models is quite limited and mainly focuses 
on single-period problems or is based on mean-variance tradeoffs. For instance, 
Lau (1980) analyzes the classical newsvendor model, in which he maximizes the 
decision maker’s expected utility of total profit or the probability of achieving 
a certain level of profit. Eeckhoudt, Gollier and Schlesinger (1995) focus on the 
impact of risk and risk aversion in the newsvendor model when risk is measured 
by expected utility functions. 

Chen and Federgruen (2000) analyze the mean-variance tradeoffs in newsvendor 
models as well as some standard infinite-horizon inventory models. Specifically, 
in the infinite-horizon models, Chen and Federgruen focus on the mean-variance 
tradeoff of customer waiting time as well as the mean-variance tradeoffs of 
inventory levels. Martínez-de-Albéniz and Simchi-Levi (2006) study the mean-variance 
tradeoffs faced by a manufacturer signing a portfolio of option contracts with its 
suppliers and having access to a spot market. 

Assuming a linear ordering cost, Bouakiz and Sobel (1992) minimize the expected 
exponential utility of the present value of costs over a finite planning horizon 
or an infinite horizon. In particular, they show that a base stock policy is optimal. 

So far, all the papers referenced above assume that demand is exogenous. A rare 
exception is Agrawal and Seshadri (2000), who consider a risk-averse retailer that 
has to decide on its ordering quantity and selling price for a single period. They 
demonstrate that different assumptions on the demand-price function may lead 
to different properties of the selling price. 

In this section, we discuss a general framework for incorporating risk aversion 
in multiperiod inventory (and pricing) models, in which risk is measured based 
on increasing and concave utility functions. Our analysis is based on Chen, Sim, 
Simchi-Levi and Sun (2007). 

The assumptions made in the risk-averse models are similar to those in the joint 
inventory and pricing models analyzed in Sect. 10.4. One exception is that demand 
is a linear function of the selling price; that is, $D_t(p)$ is a linear function of 
p. More importantly, the objective of the risk-averse decision maker is to maximize 
the expected utility of the total discounted profit over the planning horizon. That 
is, the objective is to maximize 



$$E[u(V_T^{\gamma}(x))] \qquad (10.20)$$

for any initial inventory level x and any given $0 < \gamma \le 1$, where $u(\cdot)$ 
is a utility function and $V_T^{\gamma}(x)$ is defined in (10.5). 

We require the utility function, u(x), to be increasing so that more is always 
preferred over less. Of course, if u(x) is a linear and increasing function, the model 
(10.20) yields the same optimal solution as the risk-neutral model of (10.6). We 
also assume that the utility function is concave so that the marginal satisfaction 
of gaining a dollar is never more than the marginal loss of satisfaction associated 
with losing the same amount of money. It is appropriate to point out that expected 
utility theory is widely used in microeconomics and finance literature. 

In the next subsection, we discuss the risk-averse framework based on a general 
increasing and concave utility function. This is followed by a subsection on models 
based on an important special case, the exponential utility. 

10.7.1 Expected Utility Risk-Averse Models 

Unlike the risk-neutral models analyzed in Sect. 10.4, the objective function (10.20) 
in its current form appears not to be decomposable and is not amenable to the 
dynamic programming approach. To deal with this issue, we introduce a new 
variable w to denote the wealth accumulated from the beginning of the planning 
horizon up to the current period. Thus, the state of the problem at period t can 
now be modeled as the inventory level $x_t$ and the accumulated wealth from period 
1 to period t, $w_t$. 

Consider the expected utility measure. Let $W_t(x, w)$ be the maximum utility 
achievable starting at the beginning of period t with an initial inventory level x 
and an accumulated wealth w. The dynamic program can be written as follows. 
Let 

$$W_{T+1}(x, w) = u(w),$$

and for t = 1, 2, ..., T, 

$$W_t(x, w) = \max_{y \ge x, \; \bar{p}_t \ge p \ge \underline{p}_t} E[W_{t+1}(x^+, w^+)], \qquad (10.21)$$

where 

$$x^+ = y - D_t(p, \epsilon_t)$$

and 

$$w^+ = w + \gamma^{t-1} \left( -K \delta(y - x) - c_t (y - x) + p D_t(p, \epsilon_t) - h_t(y - D_t(p, \epsilon_t)) \right). \qquad (10.22)$$

We would like to emphasize that in this section, $D_t(p, \epsilon_t)$ is linear in p. 
Also notice that here we assume, without loss of generality, that $W_{T+1}(x, w)$ is 
independent of x, which implies zero salvage value. Finally, we have 

$$\max E[u(V_T^{\gamma}(x))] = W_1(x, 0).$$
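
The state transition behind (10.21)-(10.22) can be written as a small function. This is a sketch under stated assumptions: the name step, its argument list, and the sample cost data are hypothetical; demand stands for one realized value of the demand in period t; and $\delta(z)$ is taken to be 1 if z > 0 and 0 otherwise.

```python
from typing import Callable, Tuple


def step(x: float, w: float, y: float, p: float, demand: float, t: int,
         gamma: float, K: float, c: float,
         h: Callable[[float], float]) -> Tuple[float, float]:
    """One transition of the state (inventory, wealth) in period t.

    x, w   : inventory level and accumulated wealth at the start of period t
    y, p   : order-up-to level (y >= x) and selling price chosen in period t
    demand : realized demand in period t
    c, h   : stand-ins for the ordering cost c_t and the cost function h_t
    """
    if y < x:
        raise ValueError("the order-up-to level must satisfy y >= x")
    fixed = K if y > x else 0.0                    # K * delta(y - x)
    profit = -fixed - c * (y - x) + p * demand - h(y - demand)
    x_next = y - demand                            # x+ in the text
    w_next = w + gamma ** (t - 1) * profit         # w+ as in (10.22)
    return x_next, w_next


# Hypothetical data: holding cost 1 and backorder cost 4 per unit.
h = lambda z: 1.0 * max(z, 0.0) + 4.0 * max(-z, 0.0)
x1, w1 = step(x=2.0, w=0.0, y=10.0, p=5.0, demand=7.0, t=1,
              gamma=0.9, K=3.0, c=2.0, h=h)
```

With these numbers the period profit is -3 - 16 + 35 - 3 = 13, so the next state is inventory 3 and wealth 13.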

Instead of working with the dynamic program (10.21), we find that it is more 
convenient to work with an equivalent formulation. Let 

$$U_t(x, w) = W_t(x, w - \gamma^{t-1} c_t x).$$

The dynamic program (10.21) becomes 

$$U_t(x, w) = \max_{y \ge x, \; \bar{p}_t \ge p \ge \underline{p}_t} E[U_{t+1}(x^+, w^+)], \qquad (10.23)$$

where 

$$w^+ = w + \gamma^{t-1} \left( -K \delta(y - x) + f_t(y, p, \epsilon_t) \right)$$

and 

$$f_t(y, p, \epsilon_t) = -(c_t - \gamma c_{t+1}) y + (p - \gamma c_{t+1}) D_t(p, \epsilon_t) - h_t(y - D_t(p, \epsilon_t)). \qquad (10.24)$$

We have the following observation, which can be easily verified by induction. 

Lemma 10.7.1 For any period t and fixed x, $U_t(x, w)$ is increasing in w. 

Interestingly, this observation allows us to show that a wealth-dependent base 
stock inventory policy is optimal when there is zero fixed ordering cost. 

Theorem 10.7.2 Assume that K = 0. In this case, $U_t(x, w)$ is jointly concave in 
x and w for any period t. Furthermore, a wealth-dependent base stock inventory 
policy is optimal for the risk-averse inventory (and pricing) problem (10.20). 

Proof. We prove by induction. Obviously, $U_{T+1}(x, w)$ is jointly concave in x and 
w. Assume that $U_{t+1}(x, w)$ is jointly concave in x and w. We now prove that a 
wealth-dependent base stock inventory policy is optimal and $U_t(x, w)$ is jointly 
concave in x and w. 

First, notice that for any realization of $\epsilon_t$, $f_t$ is jointly concave in 
(y, p), which implies that $w^+$ is jointly concave in (w, x, y, p). 

Since $x^+$ is a linear function of (y, p) and $w^+$ is jointly concave in 
(w, x, y, p), Lemma 10.7.1 allows us to show that $U_{t+1}(x^+, w^+)$ is jointly 
concave in (w, x, y, p). This implies that $E[U_{t+1}(x^+, w^+)]$ is jointly concave 
in (w, x, y, p). 

We now prove that a w-dependent base stock inventory policy is optimal. Let 
$y^*(w)$ be an optimal solution for the problem 

$$\max_{y} \left\{ \max_{\bar{p}_t \ge p \ge \underline{p}_t} E[U_{t+1}(x^+, w^+)] \right\}.$$

Since $E[U_{t+1}(x^+, w^+)]$ is concave in y for any fixed w, it is optimal to order up 
to $y^*(w)$ when $x < y^*(w)$ and not to order otherwise. In other words, a 
state-dependent base stock inventory policy is optimal. 

Finally, according to Proposition 2.1.15, $U_t(x, w)$ is jointly concave. ■ 

Recall that in the case of a risk-neutral decision maker, a base stock list price 
policy is optimal. Theorem 10.7.2 thus implies that in the case of an increasing 
concave utility risk measure, the optimal policy is quite different. Indeed, in these 
cases, the base stock level depends on the total profit accumulated from the 
beginning of the planning horizon, and it is not clear whether a list price policy is 
optimal. 

Stronger results exist for models based on the exponential utility risk measure, 
as is demonstrated in the next subsection. 

10.7.2 Exponential Utility Risk-Averse Models 

We now focus on exponential utility functions of the form 
$u(w) = b(1 - \exp(-w/b))$ with parameter b > 0. The beauty of exponential utility 
functions is that we can essentially separate x and w as is illustrated in the next 
theorem. 

Theorem 10.7.3 For any time period t, there exists a function $G_t(x)$ such that 

$$U_t(x, w) = u(w + \gamma^{t-1} G_t(x)).$$



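Proof. We prove by induction. For t = T + 1, $G_{T+1}(x) = 0$ for any x. Assume that 
there exists a function $G_{t+1}(x)$ such that 

$$U_{t+1}(x, w) = u(w + \gamma^{t} G_{t+1}(x)).$$

From the recursion (10.21), we have that 

$$U_t(x, w) = \max_{y \ge x, \; \bar{p}_t \ge p \ge \underline{p}_t} b \, E\left[ 1 - \exp\left( -(w^+ + \gamma^{t} G_{t+1}(y - D_t(p, \epsilon_t)))/b \right) \right]$$
$$= b - b \exp(-w/b) \min_{y \ge x, \; \bar{p}_t \ge p \ge \underline{p}_t} \exp\left( \gamma^{t-1} (K \delta(y - x) - L_t(y, p))/b \right)$$
$$= u(w + \gamma^{t-1} G_t(x)),$$

where 

$$L_t(y, p) = -\frac{b}{\gamma^{t-1}} \ln\left( E\left[ \exp\left( -\gamma^{t-1} \left( f_t(y, p, \epsilon_t) + \gamma G_{t+1}(y - D_t(p, \epsilon_t)) \right)/b \right) \right] \right)$$

and 

$$G_t(x) = \max_{y \ge x, \; \bar{p}_t \ge p \ge \underline{p}_t} -K \delta(y - x) + L_t(y, p). \qquad (10.25)$$

Thus, the result is true. ■ 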

The theorem thus implies that the optimal policy is independent of the accu- 
mulated wealth when exponential utility functions are used, which significantly 
simplifies the problem. In fact, the optimal policy can be found by solving prob- 
lem (10.25). Furthermore, this theorem, together with Theorem 10.7.2, implies 
that when there is zero fixed ordering cost, a base stock inventory policy is op- 
timal under the exponential utility risk criterion independent of whether or not 
price is a decision variable. 
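
The wealth independence can be seen on a toy single-period instance with K = 0 and a fixed price. The sketch below is illustrative only: the demand values, cost parameters, and the helper names profit and best_level are hypothetical, and the search is a brute-force enumeration rather than the recursion (10.25).

```python
import math

B = 10.0                        # hypothetical parameter b of the utility
DEMANDS = (4.0, 8.0, 12.0)      # hypothetical, equally likely demand values


def u(w):
    """Exponential utility u(w) = b (1 - exp(-w / b))."""
    return B * (1.0 - math.exp(-w / B))


def profit(y, d, p=5.0, c=2.0):
    """Single-period profit with backlogging: revenue, ordering, inventory cost."""
    z = y - d                   # ending inventory (negative means backlog)
    return p * d - c * y - (1.0 * max(z, 0.0) + 6.0 * max(-z, 0.0))


def best_level(w, levels=range(0, 16)):
    """Order-up-to level maximizing E[u(w + profit)] for initial wealth w."""
    def expected_utility(y):
        return sum(u(w + profit(y, d)) for d in DEMANDS) / len(DEMANDS)
    return max(levels, key=expected_utility)


choices = [best_level(w) for w in (-20.0, 0.0, 20.0)]
```

The three entries of choices coincide (each equals 6 here): changing w only rescales the expected utility by the positive factor exp(-w/b), so the maximizer does not move. For very large w the utilities saturate in floating point, which is why moderate wealth levels are used.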

Before we present our main result for the problem with K > 0, recall the famous 
Hölder inequality. 

Theorem 10.7.4 Assume p, q > 0 with 1/p + 1/q = 1. If f and g are continuous 
functions on $\Re$ with $\int_{\Re} |f(x)|^p dx < \infty$ and 
$\int_{\Re} |g(x)|^q dx < \infty$, then 

$$\int_{\Re} |f(x) g(x)| dx \le \left( \int_{\Re} |f(x)|^p dx \right)^{1/p} \left( \int_{\Re} |g(x)|^q dx \right)^{1/q}.$$

An important corollary of the Hölder inequality is as follows. 

Theorem 10.7.5 If a function f is convex, K-convex, or symmetric K-convex, 
then the function 

$$g(x) = \ln(E[\exp(f(x - \xi))])$$

is also convex, K-convex, or symmetric K-convex, respectively. 



Proof. We only prove the case with K-convexity; the other two cases can be proven 
by following similar steps. 

Define $M(x) = E[\exp(f(x - \xi))]$. It suffices to prove that for any $x_0, x_1$ with 
$x_0 \le x_1$ and any $\lambda \in [0, 1]$, 

$$M(x_{\lambda}) \le M(x_0)^{1-\lambda} M(x_1)^{\lambda} \exp(\lambda K),$$

where $x_{\lambda} = (1 - \lambda) x_0 + \lambda x_1$. Notice that 

$$M(x_{\lambda}) \le E[\exp((1 - \lambda) f(x_0 - \xi) + \lambda f(x_1 - \xi) + \lambda K)]$$
$$= \exp(\lambda K) E[\exp((1 - \lambda) f(x_0 - \xi)) \exp(\lambda f(x_1 - \xi))]$$
$$\le \exp(\lambda K) E[\exp(f(x_0 - \xi))]^{1-\lambda} E[\exp(f(x_1 - \xi))]^{\lambda}$$
$$= M(x_0)^{1-\lambda} M(x_1)^{\lambda} \exp(\lambda K),$$

where the first inequality holds since f is K-convex and the second inequality 
follows from the Hölder inequality with $1/p = 1 - \lambda$ and $1/q = \lambda$. ■ 
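
The key inequality of the proof can be checked numerically in the convex case (K = 0). This is only a sanity check under assumed data: the convex function f and the three-point distribution of $\xi$ are hypothetical, and passing the check does not replace the proof.

```python
import math
import random

# Hypothetical distribution of xi as (value, probability) pairs.
XI = ((-2.0, 0.3), (0.0, 0.5), (3.0, 0.2))


def f(x):
    """A convex function, chosen only for illustration."""
    return abs(x) + 0.1 * x * x


def M(x):
    """M(x) = E[exp(f(x - xi))]."""
    return sum(p * math.exp(f(x - v)) for v, p in XI)


def holder_bound_holds(trials=10_000, seed=1):
    """Check M(x_lam) <= M(x0)^(1 - lam) * M(x1)^lam on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x0, x1 = sorted(rng.uniform(-5.0, 5.0) for _ in range(2))
        lam = rng.random()
        x_lam = (1.0 - lam) * x0 + lam * x1
        bound = M(x0) ** (1.0 - lam) * M(x1) ** lam
        if M(x_lam) > bound * (1.0 + 1e-12):   # small slack for rounding
            return False
    return True
```

Taking logarithms of the checked inequality gives $g(x_{\lambda}) \le (1 - \lambda) g(x_0) + \lambda g(x_1)$, which is convexity of g for this instance.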

We can now present the optimal policy for the risk-averse multiperiod inventory 
(and pricing) problem with exponential utility function. 

Theorem 10.7.6 (a) If price is not a decision variable (i.e., $\underline{p}_t = \bar p_t$ for each $t$),
$G_t(x)$ and $L_t(y, p)$ are $K$-concave and an $(s, S)$ inventory policy is optimal.

(b) If price is a decision variable, $G_t(x)$ and $L_t(y, p)$ are symmetric $K$-concave
and an $(s, S, A, p)$ policy is optimal.

Proof. We only provide a sketch of the proof; the complete proof is left as an
exercise. The main idea of the proof is as follows: If $G_{t+1}(x)$ is $K$-concave when
price is not a decision variable (or symmetric $K$-concave when price is a
decision variable), then, by Theorem 10.7.5, $L_t(y, p)$ is $K$-concave (or symmetric
$K$-concave). The remaining parts follow directly from Lemma 9.3.2 and Proposition
9.3.3 for $K$-concavity (or Lemma 10.4.3 and Proposition 10.4.4 for symmetric
$K$-concavity). ∎

We observe the similarities and differences between the optimal policy under the
exponential utility measure and the one under the risk-neutral case. Indeed, when
demand is exogenous, that is, price is not a decision variable, an $(s, S)$ inventory
policy is optimal for the risk-neutral case; see Theorem 9.3.4. Theorem 10.7.6
implies that this is also true under the exponential utility measure. Similarly, for
the more general inventory and pricing problem, Theorem 10.4.8 implies that an
$(s, S, A, p)$ policy is optimal for the risk-neutral case. Interestingly, this policy is
also optimal for the exponential utility case.

TABLE 10.2. Summary of results for finite-horizon risk-neutral and risk-averse models

                                Price not a decision              Price is a decision
                                K = 0             K > 0           K = 0                     K > 0
Risk-neutral model              Base stock        (s, S)          Base stock list price     (s, S, A, p)
Exponential utility             Base stock        (s, S)          Base stock                (s, S, A, p)
Increasing & concave utility    Wealth-dependent  ?               Wealth-dependent          ?
                                base stock                        base stock

Of course, the results for the risk-neutral case are a bit stronger. Indeed, if
demand is additive, Theorem 10.4.7 suggests that an $(s, S, p)$ policy is optimal.
Unfortunately, it is not clear whether this result still holds for the risk-averse
inventory and pricing problem under the exponential utility measure.

The structural results of the optimal policies for the finite-horizon risk-averse
models as well as risk-neutral models are summarized in Table 10.2. For
infinite-horizon models with exponential utility and fixed ordering cost, Chen and Sun
(2012) prove that, as in the infinite-horizon risk-neutral model, a stationary $(s, S, p)$
policy is optimal.



10.8 Exercises 

Exercise 10.1. Prove Theorem 10.7.5 by Exercise 2.4. 

Exercise 10.2. Complete the proof of Theorem 10.7.6. 

Exercise 10.3. Recall the single-period model analyzed in Sect. 10.3. We modify 
the model as follows. Instead of placing an emergency order to satisfy shortages, 
we assume that unsatisfied demand is lost. In this case, h(x) is the penalty cost 
for lost sales if x < 0. Show that the optimal selling price for the additive demand case
is no more than that for the deterministic demand case, which in turn is no more 
than that for the multiplicative demand case. 

Exercise 10.4. Building on the concept of symmetric $K$-convexity, Ye and
Duenyas (2007) introduce the concept of $(K, Q)$-convexity. A real-valued function
$f$ is called $(K, Q)$-convex for $K, Q \ge 0$ if, for any $x_0, x_1$ with $x_0 \le x_1$ and
$\lambda \in [0, 1]$,

\[
f((1 - \lambda) x_0 + \lambda x_1) \le (1 - \lambda) f(x_0) + \lambda f(x_1) + \lambda K + (1 - \lambda) Q - \min\{\lambda, 1 - \lambda\} \min\{K, Q\}.
\]

It is easy to see that $(K, 0)$-convexity is exactly $K$-convexity and $(K, K)$-convexity
is symmetric $K$-convexity. Prove the following.

(a) A $(K, Q)$-convex function is also $(K', Q')$-convex for $K \le K'$ and $Q \le Q'$.
A real-valued convex function is $(0, 0)$-convex and hence $(K, Q)$-convex for
all $K, Q \ge 0$.

(b) If $g_1(y)$ and $g_2(y)$ are $(K_1, Q_1)$-convex and $(K_2, Q_2)$-convex, respectively,
and $(K_1 - Q_1)(K_2 - Q_2) \ge 0$, then for $\alpha, \beta \ge 0$, $\alpha g_1(y) + \beta g_2(y)$ is
$(\alpha K_1 + \beta K_2, \alpha Q_1 + \beta Q_2)$-convex.

(c) If $g(y)$ is $(K, Q)$-convex and $w$ is a random variable, then $E\{g(y - w)\}$ is
also $(K, Q)$-convex, provided $E\{|g(y - w)|\} < \infty$ for all $y$.

(d) Assume that $g$ is a continuous $(K, Q)$-convex function with $K \ge Q$ and
$g(y) \to \infty$ as $|y| \to \infty$. Define

\[
\begin{aligned}
S &= \min\{x \mid g(x) \le g(y) \text{ for any } y\}, \\
s &= \min\{x \mid g(x) = g(S) + K\}, \\
s' &= \sup\{x \mid x \le S,\ g(x') \ge g(S) + (K - Q) \text{ for any } x' \le x\},
\end{aligned}
\]

and

\[
u = \inf\{x \mid x \ge S,\ g(x') \ge g(S) + Q \text{ for all } x' \ge x\}.
\]

Then $s \le s' \le S \le u$, and we have the following results.

(i) $g(s) = g(S) + K$ and $g(y) \ge g(s)$ for all $y \le s$.

(ii) $g(u) = g(S) + Q$ and $g(y) \ge g(u)$ for all $y \ge u$.

(iii) $g(y) \le g(z) + Q$ for all $y, z$ with $z \le y \le s'$.

(iv) $g(y) \le g(z) + K$ for all $y, z$ with $s' \le y \le z$.

(v) $g(y) \le g(z) + K$ for all $y, z$ with $(s + S)/2 \le y \le z$.
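
The defining inequality of $(K, Q)$-convexity in Exercise 10.4 can be explored numerically before attempting the proofs. The sketch below is an illustration only: the helper name `is_kq_convex`, the grid, and the test function `h` are assumptions introduced here, and a grid check can refute the inequality but cannot prove it.

```python
def is_kq_convex(f, K, Q, xs, lams, tol=1e-9):
    """Test the (K, Q)-convexity inequality for all grid pairs x0 <= x1 and all lam."""
    for i, x0 in enumerate(xs):
        for x1 in xs[i:]:
            for lam in lams:
                lhs = f((1 - lam) * x0 + lam * x1)
                rhs = ((1 - lam) * f(x0) + lam * f(x1) + lam * K + (1 - lam) * Q
                       - min(lam, 1 - lam) * min(K, Q))
                if lhs > rhs + tol:
                    return False
    return True

def h(x):
    """Hypothetical example: h(x) = min over y >= x of delta(y - x) + y**2 (fixed cost 1)."""
    return 1.0 if x <= -1.0 else x * x

GRID = [-3.0 + 0.25 * i for i in range(25)]   # grid on [-3, 3]
LAMS = [i / 10 for i in range(11)]            # lam in {0, 0.1, ..., 1}

print(is_kq_convex(lambda x: x * x, 0.0, 0.0, GRID, LAMS))  # convex function
print(is_kq_convex(h, 1.0, 0.0, GRID, LAMS))                # (1, 0)-convex, i.e., 1-convex
print(is_kq_convex(h, 0.0, 0.0, GRID, LAMS))                # but not convex
```

With $Q = 0$ the right-hand side reduces to the usual $K$-convexity bound, which is why `h` passes with $(K, Q) = (1, 0)$ but fails with $(0, 0)$.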

Exercise 10.5. (Chen and Simchi-Levi 2009) Given a $(K, Q)$-convex function $f$,
prove that the function

\[
g(x) = \min_{y \le x} Q\delta(y - x) + f(y)
\]

is also $(K, Q)$-convex, where $\delta(x) = 1$ for $x > 0$ and $\delta(x) = 0$ otherwise. Similarly,

\[
h(x) = \min_{y \ge x} K\delta(y - x) + f(y)
\]

is also $(K, Q)$-convex.

Exercise 10.6. (Chen and Simchi-Levi 2009) Assume that $f : \mathbb{R} \to \mathbb{R}$ is
$(K, Q)$-convex. Prove that there exists a convex function $\hat f(x)$ such that

\[
\hat f(x) \le f(x) \le \hat f(x) + \max\{K, Q\}, \quad \text{for any } x.
\]
Exercise 10.7. (Chen, Zhang and Zhou 2010) A function $f$ is quasi-$K$-concave
with changeover $a$ if it is increasing on $(-\infty, a]$ and non-$K$-increasing on $[a, \infty)$.
Prove the following statement: If $r(\cdot)$ is a differentiable concave function and $w(\cdot)$ is
a continuously differentiable quasi-$K$-concave function with some finite changeover
$\xi^0$, then the function $f(\cdot)$ defined by $f(x) = \max_{d \in [\underline{d}, \bar d]} r(d) + w(x - d)$ is
quasi-$K$-concave with a finite changeover no less than $\xi^0$. Is the differentiability
assumption dispensable?



Part III 



Competition, Coordination 
and Design Models 



11 

Supply Chain Competition 
and Collaboration Models 



In this chapter, we analyze decentralized supply chain systems with independent 
retailers, each of which — facing uncertain demand — needs to decide its stock level 
and selling price in a single period. In Sect. 11.1, the retailers compete on prices for 
which noncooperative game theory is appropriate. In Sect. 11.2, the retailers do not 
compete and have incentives to form coalitions that place joint orders and share 
inventory due to risk-pooling effects and economies of scale. The model and analy- 
sis in Sects. 11.1 and 11.2 are based on Bernstein and Federgruen (2004) and Chen 
(2009), respectively. Our intention is to provide a snapshot of the applications of 
game theory to supply chain management. For surveys on this topic, see Cachon 
and Netessine (2004) and Nagarajan and Sosic (2008). 



11.1 Inventory and Pricing Competition 

Consider a system with $n$ independent retailers. Let $N = \{1, 2, \ldots, n\}$ denote the
set of retailers. Each retailer faces a single-period problem similar to the one in
Sect. 10.3. Specifically, retailer $i$ ($i \in N$), facing demand uncertainty, has to decide
on its stock level $y_i$ and a selling price $p_i \in [\underline{p}_i, \bar p_i]$ of a single product before the
realization of the demand uncertainty. Demand is filled as much as possible from
the on-hand inventory, and unsatisfied demand is filled with an emergency order.
For retailer $i$, let $c_i$ be the unit ordering cost, $h_i^-$ the unit emergency ordering cost,
and $h_i^+$ the unit inventory holding/disposal cost if $h_i^+$ is nonnegative or the unit
salvage value if it is negative. Assume that



D. Simchi-Levi et al., The Logic of Logistics: Theory, Algorithms, and Applications
for Logistics Management, Springer Series in Operations Research and Financial Engineering,
DOI 10.1007/978-1-4614-9149-1_11, © Springer Science+Business Media New York 2014






\[
h_i^- \ge c_i \ge \max\{0, -h_i^+\}.
\]

As we pointed out in Sect. 10.3, this implies that the salvage value is no more than
the normal unit ordering cost, which in turn is no more than the unit cost for the
emergency order. To avoid trivial cases, we also assume that $\underline{p}_i > h_i^-$.

The demand of retailer $i$ ($i \in N$), $D_i(p, \alpha_i)$, is a deterministic function of the
prices of all retailers, $p = (p_1, \ldots, p_n)$, times a nonnegative random noise $\alpha_i$ with
a continuous cdf $F_i$. That is,

\[
D_i(p, \alpha_i) = d_i(p)\alpha_i.
\]

Without loss of generality, assume $E[\alpha_i] = 1$. The expected demand $d_i(p) =
E[D_i(p, \alpha_i)]$ ($i \in N$) is assumed to be differentiable in $p$. We also assume

\[
\frac{\partial d_i(p)}{\partial p_i} \le 0 \quad \text{and} \quad \frac{\partial d_i(p)}{\partial p_j} \ge 0 \quad \forall\, j \ne i.
\]



The first inequality indicates that a higher selling price of a retailer leads to a 
lower demand of itself, which is reasonable under most circumstances. The second 
inequality implies that a higher selling price of a retailer leads to higher demands 
of other retailers, which implies that the products offered by the retailers are 
substitutable. One additional assumption is imposed on di(p). 

Assumption 11.1.1 For $i \in N$, $d_i(p)$ is log-supermodular; that is, $\log d_i(p)$ is
supermodular.



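Several plausible demand functions satisfy these assumptions:

• the linear demand

\[
d_i(p) = b_i - \sum_{j \in N} a_{ij} p_j \quad (b_i > 0,\ a_{ii} > 0, \text{ and } a_{ij} \le 0\ \ \forall\, i, j \in N,\ i \ne j);
\]

• the exponential demand

\[
d_i(p) = e^{\,b_i - \sum_{j \in N} a_{ij} p_j} \quad (a_{ii} > 0 \text{ and } a_{ij} \le 0\ \ \forall\, i, j \in N,\ i \ne j);
\]

• the Cobb-Douglas demand

\[
d_i(p) = b_i \prod_{j \in N} p_j^{-a_{ij}} \quad (b_i > 0,\ a_{ii} > 1, \text{ and } a_{ij} \le 0\ \ \forall\, i, j \in N,\ i \ne j);
\]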

• the constant elasticity of substitution demand

\[
d_i(p) = \frac{\gamma\, p_i^{-r-1}}{\sum_{j \in N} p_j^{-r}} \quad (\gamma > 0,\ r > 0);
\]
• the Logit demand

\[
d_i(p) = \frac{e^{-a_i p_i}}{1 + \sum_{j \in N} e^{-a_j p_j}} \quad (a_i > 0\ \ \forall\, i \in N).
\]

Given the vector of stock levels $y = (y_1, \ldots, y_n)$ and the price vector $p$, retailer
$i$’s strategy set is $S_i = [0, \infty) \times [\underline{p}_i, \bar p_i]$ and its expected profit is given by

\[
\begin{aligned}
v_i(y, p) &= p_i d_i(p) - c_i y_i - h_i^+ E[\max(y_i - d_i(p)\alpha_i, 0)] - h_i^- E[\max(d_i(p)\alpha_i - y_i, 0)] \\
&= p_i d_i(p) - d_i(p)\,\ell_i\!\left(\frac{y_i}{d_i(p)}\right),
\end{aligned}
\]

where

\[
\ell_i(y) = h_i^- - (h_i^- - c_i) y + (h_i^- + h_i^+) E[\max(y - \alpha_i, 0)].
\]

Since a retailer’s expected profit depends on other retailers’ decisions, the system
can be modeled as a noncooperative game $(N, \{S_i\}_{i \in N}, \{v_i\}_{i \in N})$. Assume that all
the costs, demand functions, and the structure of the game are common knowledge.
The concept of Nash equilibrium is a natural prediction of the outcome of the
system.

Since a retailer’s stocking decision has no impact on other retailers, its best
stock level in response to the price vector $p$ can be easily derived as the optimal
ordering quantity of a newsvendor problem. Specifically, retailer $i$’s optimal stock
level is given by

\[
y_i(p) = d_i(p) F_i^{-1}(\rho_i),
\]

where $F_i^{-1}(\rho_i)$ is a solution of $\ell_i'(y) = 0$ and $\rho_i = \frac{h_i^- - c_i}{h_i^- + h_i^+}$. With this stock level,
retailer $i$’s reduced expected profit is now a function of the price vector only:

\[
\pi_i(p) = d_i(p)\big(p_i - \ell_i(F_i^{-1}(\rho_i))\big).
\]

We end up with a reduced game $(N, \{S_i'\}_{i \in N}, \{\pi_i\}_{i \in N})$, where $S_i' = [\underline{p}_i, \bar p_i]$. Since
$\pi_i(p)$ can be easily shown to be log-supermodular by Assumption 11.1.1, the game
$(N, \{S_i'\}_{i \in N}, \{\pi_i\}_{i \in N})$ belongs to the class of log-supermodular games and thus
shares the properties of supermodular games in Theorem 3.1.4.
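
The reduction above can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration and does not come from the text: two symmetric retailers, the linear demand $d_i(p) = 10 - 2p_i + p_j$, noise uniform on $[0, 2]$ (so that $F_i(y) = y/2$ and $E[\max(y - \alpha_i, 0)] = y^2/4$ on that interval), and the cost values. Starting from the lowest prices and iterating best responses is a common heuristic for supermodular games; it is used here only as a sketch.

```python
def newsvendor_terms(c, h_plus, h_minus):
    """For noise uniform on [0, 2], return (z, k) with z = F^{-1}(rho) and k = l(z)."""
    rho = (h_minus - c) / (h_minus + h_plus)   # critical ratio rho_i
    z = 2.0 * rho                              # F(y) = y / 2 on [0, 2]
    expected_overage = z * z / 4.0             # E[max(z - alpha, 0)] for z in [0, 2]
    k = h_minus - (h_minus - c) * z + (h_minus + h_plus) * expected_overage
    return z, k

def best_response(i, prices, b, a, k, bounds):
    """Maximize (p_i - k_i) * d_i(p) over p_i for linear demand, clipped to the bounds."""
    base = b[i] - sum(a[i][j] * prices[j] for j in range(len(prices)) if j != i)
    p = 0.5 * (base / a[i][i] + k[i])
    lo, hi = bounds[i]
    return min(max(p, lo), hi)

def equilibrium(b, a, k, bounds, iters=200):
    """Iterate simultaneous best responses, starting from the lowest feasible prices."""
    prices = [lo for lo, _ in bounds]
    for _ in range(iters):
        prices = [best_response(i, prices, b, a, k, bounds) for i in range(len(b))]
    return prices

# Hypothetical data: d_i(p) = b_i - a_ii p_i - a_ij p_j with a_ij = -1 (substitutes).
B = [10.0, 10.0]
A = [[2.0, -1.0], [-1.0, 2.0]]
BOUNDS = [(2.0, 10.0), (2.0, 10.0)]
z, k = newsvendor_terms(c=1.0, h_plus=0.5, h_minus=2.0)

prices = equilibrium(B, A, [k, k], BOUNDS)
demands = [B[i] - sum(A[i][j] * prices[j] for j in range(2)) for i in range(2)]
stocks = [d * z for d in demands]              # y_i(p) = d_i(p) F^{-1}(rho_i)
print(prices, stocks)
```

In this instance each retailer behaves as if it had a deterministic unit cost $\ell_i(F_i^{-1}(\rho_i)) = 1.6$, which lies between the ordering cost and the emergency cost, and pricing is decoupled from the stocking decision exactly as in the reduced game.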

Theorem 11.1.2 Under Assumption 11.1.1, the set of Nash equilibria of the reduced
game $(N, \{S_i'\}_{i \in N}, \{\pi_i\}_{i \in N})$ is nonempty and has a largest and a smallest
element. In addition, the largest and smallest Nash equilibria are increasing in
$c_i$, $h_i^+$, and $h_i^-$ ($i \in N$).



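Proof. It remains to prove the second part. From Theorem 3.1.4, it suffices to
show that $\pi_i(p)$ has increasing differences in $(p_i, w_i)$ for any fixed $p_{-i}$, where
$w_i$ is $c_i$, $h_i^+$, or $h_i^-$. Given the formulation of $\pi_i(p)$, we only need to prove that

\[
\frac{\partial \ell_i(F_i^{-1}(\rho_i))}{\partial c_i}, \quad \frac{\partial \ell_i(F_i^{-1}(\rho_i))}{\partial h_i^+}, \quad \text{and} \quad \frac{\partial \ell_i(F_i^{-1}(\rho_i))}{\partial h_i^-}
\]

are nonnegative.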
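Since $F_i^{-1}(\rho_i)$ is a solution of $\ell_i'(y) = 0$, we have that

\[
\begin{aligned}
\frac{\partial \ell_i(F_i^{-1}(\rho_i))}{\partial h_i^-}
&= \big(1 - F_i^{-1}(\rho_i)\big) + E[\max(F_i^{-1}(\rho_i) - \alpha_i, 0)]
 + \ell_i'(y)\big|_{y = F_i^{-1}(\rho_i)} \frac{\partial F_i^{-1}(\rho_i)}{\partial h_i^-} \\
&= \big(1 - F_i^{-1}(\rho_i)\big) + E[\max(F_i^{-1}(\rho_i) - \alpha_i, 0)] \\
&\ge 0.
\end{aligned}
\]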



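Similarly,

\[
\frac{\partial \ell_i(F_i^{-1}(\rho_i))}{\partial c_i} = F_i^{-1}(\rho_i) \ge 0
\]

and

\[
\frac{\partial \ell_i(F_i^{-1}(\rho_i))}{\partial h_i^+} = E[\max(F_i^{-1}(\rho_i) - \alpha_i, 0)] \ge 0.
\]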



Thus, the largest and smallest Nash equilibria are increasing in $c_i$, $h_i^+$, and $h_i^-$. ∎

Since the expected profit of a retailer is nondecreasing in the other retailers’
prices, the largest Nash equilibrium is preferred by all retailers. Indeed, let $p^*$
be a Nash equilibrium and $\bar p^*$ be the largest Nash equilibrium. We have that for
$i \in N$,

\[
\pi_i(\bar p^*) \ge \pi_i(p_i^*, \bar p_{-i}^*) \ge \pi_i(p^*),
\]

where the first inequality follows from the definition of Nash equilibrium and the
second one holds since $\pi_i(p)$ is nondecreasing in $p_j$ ($j \ne i$).

For any equilibrium, $p^*$, of the reduced game, it is clear that $(y(p^*), p^*)$ is an
equilibrium of the game $(N, \{S_i\}_{i \in N}, \{v_i\}_{i \in N})$, where $y(p) = (y_1(p), \ldots, y_n(p))$.
It would be interesting to see how $y(p^*)$ changes with $c_i$, $h_i^+$, $h_i^-$ when
$p^*$ is either the largest or the smallest equilibrium.

Finally, from (3.1) in Chap. 3, a sufficient condition for the uniqueness of a Nash 
equilibrium is the diagonally dominant condition: 



\[
-\frac{\partial^2 \log \pi_i(p)}{\partial p_i^2} > \sum_{j \ne i} \frac{\partial^2 \log \pi_i(p)}{\partial p_i \partial p_j}.
\]



In the exercise, you are asked to provide conditions on the demand functions listed 
earlier under which the diagonally dominant condition holds. 



11.2 Inventory Centralization Games 

In recent years, many companies have started exploring innovative collaboration 
strategies in an effort to improve their supply chain efficiency and ultimately the 
bottom line. Firms are employing strategies such as forming long-term alliances 
and building collaborative logistics to reduce their supply chain costs. There are
numerous examples of collaboration in supply chains. For instance, Good Neighbor
Pharmacy is a retailers’ cooperative network of 2,700 independently owned
and operated pharmacies, and Affiliated Foods Midwest supplies more than 850 
independent food-retailer members in 12 midwestern U.S. states with a full line of 
grocery products. 

Indeed, to compete with big box retailers, it is common for independent gro- 
cery stores, hardware stores, and pharmacies to form retailers’ cooperative groups, 
business entities that employ economies of scale on behalf of retailer-members to 
get discounts from manufacturers and to pool marketing. To join a retailers’ coop- 
erative, a store would typically pay a membership fee and purchase certain stock 
in the cooperative in return for its voting share. In addition, a store is usually 
required to purchase a minimum amount of inventory from the cooperative. The 
operating profits of the cooperative are returned to the member stores in cash 
or stock rebate (see Stankevich 1996). Over the years, retail cooperative groups 
have developed a variety of popular groupwide programs, such as insurance, pen- 
sion plans, inventory management, pricing assistance, logistics, warehousing, store 
design and layout, site selection, and employee training (see Ghosh 1994). 

These innovative strategies raise a variety of important and challenging ques- 
tions on managing supply chains. For instance, for a group of companies in a supply 
chain, how should they cooperate, what possible outcomes can be achieved, and 
how do the players share the costs and benefits? Indeed, getting all players to agree 
on how to share costs and benefits was identified as one of the major barriers to 
collaborative commerce according to a European Chemical Transport Association 
white paper. 

In this section, we consider a distribution system with multiple retailers that 
may place joint orders and keep inventory at a central warehouse. The retailers 
are interested in this type of cooperation for two reasons. First, retailers can take 
advantage of the risk-pooling effect by delaying the allocation of inventory. Second, 
exploiting economies of scale allows retailers to reduce their costs or increase their 
profits. 

The cost-allocation problem among the retailers can be modeled as a cooper- 
ative game, referred to as an inventory centralization game. We will show that 
under certain conditions, an inventory centralization game has a nonempty core, 
which implies that no group of retailers will be better off by deviating from the 
cooperation. 

11.2.1 Model 

Assume that in the distribution system, there are $m$ warehouses and $n$ retailers.
The retailers order from the outside suppliers through the warehouses and sell a
single type of goods in a single period. Let $W = \{1, 2, \ldots, m\}$ and $N = \{1, 2, \ldots, n\}$
be the sets of warehouses and retailers, respectively. The retailers are assumed to be
noncompeting and allowed to make their selling price decisions. Each retailer’s
demand depends on its own selling price and a common random variable, the
market signal $\omega$. To satisfy their demand, the retailers, taking advantage of
risk-pooling effects, may form coalitions to place joint orders through the
warehouses before observing the market signal, while the inventories are allocated to
the retailers after the market signal is revealed. Let $Z_j \subseteq W$ be the set of
warehouses that can be used to supply retailer $j$ if she does not cooperate with other
retailers. If retailer $j$ together with some other retailers decides to form a group
$S$, referred to as a coalition, by placing joint orders and sharing inventory, her
demand can be served by the inventory at any warehouse in $\cup_{j \in S} Z_j$.

The sequence of events is as follows. Before observing the realization of the
market signal, each warehouse places an order by paying an ordering cost of $c_i(y_i)$
for an order quantity $y_i$ at warehouse $i$. After the market signal $\omega$ is revealed, the
retailers decide their selling prices $p_j(\omega)$, which depend on the market signal $\omega$,
and all goods at the warehouses are allocated to the retailers; say, $x_{ij}(\omega)$ units
of goods are shipped from warehouse $i$ to retailer $j$. The transportation cost of
sending one unit of goods from warehouse $i$ to retailer $j$ is $s_{ij}$. For each retailer
$j$, if the total amount of goods received from the warehouses is more than the
realized demand, a per-unit holding cost of $h_j^+$ for excess inventory is incurred. On
the other hand, we make the following assumption regarding unsatisfied demand.

Assumption 11.2.1 Unsatisfied demand at retailer $j$ is filled by an emergency
order, which incurs a per-unit emergency ordering cost of $h_j^-$.

The demand of each retailer is random and depends on the realization of the
market signal $\omega$ and its own selling price. Specifically, we focus on demand
functions of the following form:

Assumption 11.2.2 For $j \in N$, the demand function of retailer $j$ given its price
$p_j$ satisfies

\[
d_j = D_j(p_j, \omega) := \beta_j(\omega) - \alpha_j(\omega) p_j, \tag{11.1}
\]

where $\alpha_j$ and $\beta_j$ are two nonnegative random variables, represented as functions
of the market signal $\omega$.

To avoid technical complications, we assume that the sample space $\Omega$ of $\omega$ is
finite. However, this assumption can be relaxed if necessary.

We further assume that $\underline{p}_j$ and $\bar p_j$ are the lower and upper bounds of $p_j(\omega)$,
respectively. Thus, the feasible set of retailer $j$’s price decision $p_j(\cdot)$ is given by

\[
P_j = \{p_j(\cdot) : \underline{p}_j \le p_j(\omega) \le \bar p_j,\ \forall\, \omega \in \Omega\}.
\]

The inventory centralization problem for a coalition of retailers $S \subseteq N$ can be
formulated as a two-stage stochastic programming model with recourse. In this
model, $y_i$, $i = 1, 2, \ldots, m$, is the first-stage decision variable. After the market
signal $\omega$ is revealed, a recourse decision should be made, which is the amount of
goods sent from $i$ to $j$, namely, $x_{ij}(\omega)$, for all $i \in \cup_{j \in S} Z_j$ and $j \in S$, and the
selling price $p_j(\omega)$. Let $v_j(\omega)$ be the total amount of goods received by retailer $j$.
For the coalition $S$, the objective is to maximize the expected total profit of all
retailers in $S$.
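Denote the maximum expected profit of the coalition $S$ by $V(S)$, which can be
written as the optimal value of the following two-stage stochastic programming
problem with recourse:

\[
\begin{aligned}
V(S) = \max\ & \sum_{j \in S} E\big[R_j(p_j(\omega), \omega) - f_j\big(v_j(\omega) - \beta_j(\omega) + \alpha_j(\omega) p_j(\omega)\big)\big] \\
& - \sum_{i \in \cup_{j \in S} Z_j} \Big( c_i(y_i) + \sum_{j \in S} s_{ij} E[x_{ij}(\omega)] \Big) \\
\text{s.t.}\ & y_i - \sum_{j \in S} x_{ij}(\omega) = 0, \quad i \in \cup_{j \in S} Z_j,\ \omega \in \Omega, \\
& v_j(\omega) - \sum_{i \in \cup_{j \in S} Z_j} x_{ij}(\omega) = 0, \quad j \in S,\ \omega \in \Omega, \\
& x_{ij}(\omega) \ge 0, \quad j \in S,\ i \in \cup_{j \in S} Z_j,\ \omega \in \Omega, \\
& p_j(\cdot) \in P_j, \quad j \in S,
\end{aligned}
\tag{11.2}
\]

where the maximization is taken over $(y_i, x_{ij}(\cdot), p_j(\cdot), v_j(\cdot))$, $R_j$ is the realized
revenue function for a given selling price $r$ and a realization of the market signal $\omega$:

\[
R_j(r, \omega) = r(\beta_j(\omega) - \alpha_j(\omega) r),
\]

and $f_j$ represents the inventory holding cost or emergency ordering cost

\[
f_j(x) = h_j^+ x^+ + h_j^- (-x)^+.
\]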

In the above model, the term in the first summation in the objective function 
is the expected revenue minus the expected inventory holding cost and emergency 
ordering cost. The term in the second summation in the objective function is 
the regular ordering cost and the transportation cost. The first constraint implies 
that no warehouse holds inventory. The second constraint specifies that the total 
amount of goods received by a retailer equals the total amount sent to the retailer 
from the warehouses. 

Now the pair $(N, V)$ with $V$ given by (11.2) for each coalition $S \subseteq N$ defines a
cooperative inventory centralization game.



11.2.2 Inventory Games with a Linear Ordering Cost 

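In this subsection, we assume that the ordering cost $c_i(y_i)$ is linear; by slightly
abusing the notation, we also use $c_i$ to denote the unit ordering cost. Since the
realized revenue $R_j(p_j(\omega), \omega)$ is concave in $p_j(\omega)$ by Assumption 11.2.2 and $f_j(x)$ is
convex, problem (11.2) is a concave maximization problem with linear constraints,
which allows us to apply the elegant duality theory for convex minimization
problems with linear constraints. For this purpose, define the Lagrangian function

\[
\begin{aligned}
L_S(y, p, v, x, \lambda, \mu, \pi)
= {} & \sum_{j \in S} E\big[R_j(p_j(\omega), \omega) - f_j\big(v_j(\omega) - \beta_j(\omega) + \alpha_j(\omega) p_j(\omega)\big)\big] \\
& - \sum_{i \in \cup_{j \in S} Z_j} \Big( c_i y_i + \sum_{j \in S} s_{ij} E[x_{ij}(\omega)] \Big) \\
& + \sum_{i \in \cup_{j \in S} Z_j} E\Big[\lambda_i(\omega)\Big(y_i - \sum_{j \in S} x_{ij}(\omega)\Big)\Big] \\
& + \sum_{j \in S} E\Big[\mu_j(\omega)\Big(\sum_{i \in \cup_{j \in S} Z_j} x_{ij}(\omega) - v_j(\omega)\Big)\Big] \\
& + \sum_{j \in S,\ i \in \cup_{j \in S} Z_j} E[\pi_{ij}(\omega) x_{ij}(\omega)] \\
= {} & \sum_{j \in S} \psi_j(p_j, v_j, \mu_j) + \sum_{i \in \cup_{j \in S} Z_j} y_i \big(E[\lambda_i(\omega)] - c_i\big) \\
& + \sum_{j \in S,\ i \in \cup_{j \in S} Z_j} E\big[x_{ij}(\omega)\big(\pi_{ij}(\omega) - s_{ij} - \lambda_i(\omega) + \mu_j(\omega)\big)\big],
\end{aligned}
\]

where $y = (y_i)_{i \in \cup_{j \in S} Z_j}$, $p = (p_j)_{j \in S}$, $v = (v_j)_{j \in S}$, $\lambda = (\lambda_i)_{i \in \cup_{j \in S} Z_j}$, $\mu = (\mu_j)_{j \in S}$,
$\pi = (\pi_{ij})_{j \in S,\ i \in \cup_{j \in S} Z_j}$, and

\[
\psi_j(p_j, v_j, \mu_j) = E\big[R_j(p_j(\omega), \omega) - f_j\big(v_j(\omega) - \beta_j(\omega) + \alpha_j(\omega) p_j(\omega)\big) - \mu_j(\omega) v_j(\omega)\big]. \tag{11.3}
\]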



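Consider the dual function $\gamma_S(\lambda, \mu, \pi)$ defined by

\[
\begin{aligned}
\gamma_S(\lambda, \mu, \pi) = \sup\ & L_S(y, p, v, x, \lambda, \mu, \pi) \\
\text{s.t.}\ & p_j(\cdot) \in P_j, \quad j \in S.
\end{aligned}
\]

The duality theorem for convex minimization problems with linear constraints
implies that $V(S)$ is equal to the optimal objective value of the dual problem (see,
for instance, page 299 of Bertsekas 1995):

\[
\begin{aligned}
V(S) = \min\ & \gamma_S(\lambda, \mu, \pi) \\
\text{s.t.}\ & \pi_{ij}(\omega) \ge 0, \quad j \in S,\ i \in \cup_{j \in S} Z_j,\ \omega \in \Omega.
\end{aligned}
\tag{11.4}
\]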



Let $(\lambda^*, \mu^*, \pi^*)$ be optimal for the dual problem (11.4) with $S = N$. Then, again,
the duality theorem implies that

\[
\begin{aligned}
V(N) = \max\ & L_N(y, p, v, x, \lambda^*, \mu^*, \pi^*) \\
\text{s.t.}\ & p_j(\cdot) \in P_j, \quad j \in N.
\end{aligned}
\tag{11.5}
\]

Define for $j \in N$,

\[
\begin{aligned}
l_j = \max\ & \psi_j(p_j, v_j, \mu_j^*) \\
\text{s.t.}\ & p_j(\cdot) \in P_j.
\end{aligned}
\]

We claim that $(l_1, l_2, \ldots, l_n)$ is in the core of the cooperative game $(N, V)$.

Theorem 11.2.3 The vector $l = (l_1, l_2, \ldots, l_n)$ is in the core of the cooperative
game $(N, V)$.

Proof. Notice that in the optimization problem (11.5), no constraint is imposed
on the decision variables $y_i$ and $x_{ij}(\omega)$. Thus, we must have

\[
E[\lambda_i^*(\omega)] - c_i = 0, \quad i \in \cup_{j \in N} Z_j, \tag{11.6}
\]

and

\[
\pi_{ij}^*(\omega) - s_{ij} - \lambda_i^*(\omega) + \mu_j^*(\omega) = 0, \quad j \in N,\ i \in \cup_{j \in N} Z_j,\ \omega \in \Omega. \tag{11.7}
\]

Therefore,

\[
L_S(y, p, v, x, \lambda^*, \mu^*, \pi^*) = \sum_{j \in S} \psi_j(p_j, v_j, \mu_j^*).
\]

This, together with (11.5), implies that

\[
\sum_{j \in N} l_j = V(N).
\]

In addition, since $(\lambda^*, \mu^*, \pi^*)$ is feasible for problem (11.4),

\[
\sum_{j \in S} l_j = \gamma_S(\lambda^*, \mu^*, \pi^*) \ge V(S).
\]

Thus, $l = (l_1, l_2, \ldots, l_n)$ is in the core of the cooperative game $(N, V)$. ∎

From the above proof, we know that the optimal dual variables $(\lambda^*, \mu^*, \pi^*)$ must
satisfy constraints (11.6) and (11.7). We now provide some intuition of the dual
variables and the constraints. In the dual, we attempt to allocate the ordering
cost and the transportation cost to each unit of goods received by the retailers.
Specifically, let the dual variable $\mu_j^*(\omega)$ be the charge for each unit of goods received
by retailer $j$ to compensate for its ordering cost and transportation cost and $\lambda_i^*(\omega)$
be a charge for each unit of goods sent out by warehouse $i$ to compensate for its
ordering cost if the market signal turns out to be $\omega$. The constraint (11.6) implies
that the average unit charge by warehouse $i$ should be enough to cover its ordering
cost $c_i$. Since $\pi_{ij}^*(\omega) \ge 0$, the dual constraint (11.7) implies that this unit charge
$\mu_j^*(\omega)$ at retailer $j$ should be no more than the unit price, $\lambda_i^*(\omega)$, charged by
warehouse $i$ plus the transportation cost $s_{ij}$. On the other hand, if there is a
shipment from warehouse $i$ to retailer $j$, then $\pi_{ij}^*(\omega) = 0$ by the complementary
slackness condition, and the dual constraint (11.7) implies that this unit charge
$\mu_j^*(\omega)$ is enough to compensate for the unit price, $\lambda_i^*(\omega)$, charged by warehouse $i$
plus the transportation cost $s_{ij}$.
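
The dual-price allocation of Theorem 11.2.3 can be illustrated on a deliberately simplified instance. The sketch below is a special case chosen for illustration, not the general model: a single warehouse, zero transportation cost, a fixed selling price (so price is not a decision), identical holding and emergency costs, and three equally likely market signals. All numbers and function names are hypothetical. In this special case the charge per unit of realized demand in scenario $\omega$ is `H_MINUS` when pooled demand exceeds the order, `-H_PLUS` when it falls short, and an intermediate value in tie scenarios chosen so that the expected charge equals the unit ordering cost, mirroring (11.6).

```python
from itertools import combinations

PROB = [1 / 3, 1 / 3, 1 / 3]                              # hypothetical market signals
DEMAND = {1: [10, 4, 2], 2: [3, 9, 4], 3: [2, 5, 12]}     # DEMAND[j][w], hypothetical
PRICE, C, H_PLUS, H_MINUS = 5.0, 1.0, 0.5, 3.0            # requires C < H_MINUS

def pooled_demand(S):
    return [sum(DEMAND[j][w] for j in S) for w in range(len(PROB))]

def expected_cost(y, dem):
    """Ordering cost plus expected holding and emergency cost for order quantity y."""
    return C * y + sum(q * (H_PLUS * max(y - d, 0) + H_MINUS * max(d - y, 0))
                       for q, d in zip(PROB, dem))

def optimal_order(S):
    """The cost is piecewise linear and convex in y, so a demand value is optimal."""
    dem = pooled_demand(S)
    return min([0] + dem, key=lambda y: expected_cost(y, dem))

def value(S):
    """V(S): expected revenue minus the minimal expected cost of coalition S."""
    dem = pooled_demand(S)
    revenue = PRICE * sum(q * d for q, d in zip(PROB, dem))
    return revenue - expected_cost(optimal_order(S), dem)

def dual_allocation(players):
    """Allocate profit using scenario-dependent unit charges mu(w) with E[mu] = C."""
    dem = pooled_demand(players)
    y_star = optimal_order(players)
    mu = [H_MINUS if d > y_star else -H_PLUS for d in dem]
    ties = [w for w, d in enumerate(dem) if d == y_star]
    rest = sum(PROB[w] * mu[w] for w in range(len(dem)) if w not in ties)
    common = (C - rest) / sum(PROB[w] for w in ties)      # makes E[mu] equal C
    for w in ties:
        mu[w] = common
    return {j: sum(q * (PRICE - m) * d for q, m, d in zip(PROB, mu, DEMAND[j]))
            for j in players}

def in_core(alloc, players, tol=1e-9):
    """Efficiency plus: no coalition S can earn more than it is allocated."""
    if abs(sum(alloc.values()) - value(players)) > tol:
        return False
    return all(sum(alloc[j] for j in S) >= value(S) - tol
               for r in range(1, len(players))
               for S in combinations(players, r))

players = (1, 2, 3)
alloc = dual_allocation(players)
print(alloc, in_core(alloc, players))
```

The allocation charges each retailer for its own realized demand at the scenario price, so retailers whose demand peaks when pooled inventory is scarce receive a smaller share of the profit.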






11.2.3 Inventory Games with Quantity Discounts 

In this subsection, we assume that the supplier provides quantity discounts to
encourage large orders, or a third-party carrier provides volume discounts to
encourage larger shipments. Specifically, we make the following assumption.

Assumption 11.2.4 We assume that $c_i(y)/y$ is nonincreasing. That is, the larger
the ordering quantity, the lower the average unit ordering cost.

For technical reasons, we assume that $c_i(y)$ is lower semicontinuous. That is,
$\liminf_{y \to x} c_i(y) \ge c_i(x)$ for any $x$. Further, we assume $c_i(y) \to \infty$ as $y \to \infty$. Under
these assumptions, problem (11.2) has an optimal solution for any $S \subseteq N$.

Our assumption on the ordering cost is quite general. Indeed, we don’t require
$c_i(x)$ to be continuous, monotone, convex, or concave. Moreover, it includes several
commonly used discounts: incremental discounts and all-units discounts. The
concave ordering cost analyzed in Chen and Zhang (2009) and the less-than-truckload
(LTL) volume discount function (see Muriel and Simchi-Levi 2003) are also
important special cases.

Given this general ordering cost structure, unfortunately, the corresponding
cooperative game may have an empty core. Indeed, in a special case of the inventory
centralization games in which price is not a decision variable, Chen and Zhang 
(2009) show that for a distribution system with multiple warehouses, the core 
of the corresponding cooperative game may be empty even if the ordering costs 
involve only fixed costs and demand is deterministic. 

Thus, in this subsection, we focus on inventory centralization games with a single
warehouse $(N, V)$. Since we analyze inventory games with a single warehouse, in
the following analysis, we drop the index associated with the warehouses. In this
case, the value of a coalition $S$ can be defined as

\[
\begin{aligned}
V(S) = \max\ & -c(y) + g(y, S) \\
\text{s.t.}\ & y \ge 0,
\end{aligned}
\tag{11.8}
\]

where

\[
g(y, S) = E[g_S(y, \omega)], \tag{11.9}
\]

with

\[
\begin{aligned}
g_S(y, \omega) = \max\ & \sum_{j \in S} g_j(p_j, x_j, \omega) \\
\text{s.t.}\ & y - \sum_{j \in S} x_j = 0, \\
& x_j \ge 0, \quad j \in S, \\
& \underline{p}_j \le p_j \le \bar p_j, \quad j \in S,
\end{aligned}
\]

and

\[
g_j(p_j, x_j, \omega) = R_j(p_j, \omega) - f_j\big(x_j - \beta_j(\omega) + \alpha_j(\omega) p_j\big) - s_j x_j.
\]

It is clear that given the general quantity discount function $c(y)$, the objective
function of the above optimization problem is neither convex nor concave. Thus,
analyzing it directly appears to be quite challenging. To get around this challenge,
we construct another inventory centralization game $(N, \tilde V)$ with a linear ordering
cost, which is known to have a nonempty core, such that $\tilde V(S) \ge V(S)$ for any
$S \subseteq N$ and $\tilde V(N) = V(N)$. If this could be done, then we could prove that any
element in the core of the game $(N, \tilde V)$ is in the core of the game $(N, V)$. We show
that this is true by first proving that for Problem (11.8), the bigger a coalition is,
the larger the optimal ordering quantity should be. For this purpose, we define for
any given scalar $c \ge 0$ an inventory centralization game $(N, V_c)$ with the ordering
cost being $cx$. In this game, for any $S \subseteq N$,

\[
\begin{aligned}
V_c(S) = \max\ & -cy + g(y, S) \\
\text{s.t.}\ & y \ge 0.
\end{aligned}
\]
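
The construction described above can be sketched numerically: given a quantity-discount cost $c(y)$, look for a linear unit cost $c^*$ with $V_{c^*}(N) = V(N)$ and then check that $V_{c^*}(S) \ge V(S)$ for every coalition. The instance below is hypothetical throughout (fixed price, zero transportation cost, identical holding and emergency costs, an all-units discount with one breakpoint), and the bisection is only one simple way to locate $c^*$; it assumes the grand coalition orders a positive quantity.

```python
from itertools import combinations

PROB = [1 / 3, 1 / 3, 1 / 3]                              # hypothetical market signals
DEMAND = {1: [10, 4, 2], 2: [3, 9, 4], 3: [2, 5, 12]}     # DEMAND[j][w], hypothetical
PRICE, H_PLUS, H_MINUS = 5.0, 0.5, 3.0
THRESHOLD = 20                                            # all-units discount breakpoint

def discount_cost(y):
    """c(y): unit cost 2 below THRESHOLD and 1.5 from THRESHOLD on; c(y)/y is nonincreasing."""
    return (2.0 if y < THRESHOLD else 1.5) * y

def g(y, S):
    """g(y, S): expected revenue minus holding/emergency cost with pooled inventory y."""
    total = 0.0
    for w, q in enumerate(PROB):
        d = sum(DEMAND[j][w] for j in S)
        total += q * (PRICE * d - H_PLUS * max(y - d, 0) - H_MINUS * max(d - y, 0))
    return total

def candidates(S):
    """g is piecewise linear, so an optimum lies at 0, the breakpoint, or a demand value."""
    return sorted({0, THRESHOLD} | {sum(DEMAND[j][w] for j in S) for w in range(len(PROB))})

def solve(S, cost):
    """Return (optimal value, an optimal y) of max_y -cost(y) + g(y, S)."""
    best_y = max(candidates(S), key=lambda y: (-cost(y) + g(y, S), -y))
    return -cost(best_y) + g(best_y, S), best_y

def find_c_star(players, iters=100):
    """Bisection on c so that V_c(N) = V(N); assumes the optimal order for N is positive."""
    v_n, y_n = solve(players, discount_cost)
    lo, hi = discount_cost(y_n) / y_n, H_MINUS            # V_lo(N) >= V(N) >= V_hi(N)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if solve(players, lambda y: mid * y)[0] > v_n:
            lo = mid
        else:
            hi = mid
    return hi

players = (1, 2, 3)
c_star = find_c_star(players)
coalitions = [S for r in range(1, len(players) + 1) for S in combinations(players, r)]
dominates = all(solve(S, lambda y: c_star * y)[0] >= solve(S, discount_cost)[0] - 1e-6
                for S in coalitions)
print(c_star, dominates)
```

In this instance the grand coalition orders exactly at the discount breakpoint, and the linear game with unit cost $c^* = 31/18$ gives every smaller coalition at least as much as the discount game does.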

For any nonempty set $S \subseteq N$, let $y^*(S)$ be the smallest optimal solution of
problem (11.8) that is guaranteed to exist when $c(y)$ is lower semicontinuous and
$\lim_{y \to +\infty} c(y) = +\infty$.

Lemma 11.2.5 For any given $S \subseteq N$, let $y^*(S)$ be the smallest optimal ordering
quantity for the postponed pricing model (11.8)-(11.9). We have $y^*(S_1) \le y^*(S_2)$
for $S_1 \subseteq S_2$.



Proof. We prove this result by contradiction. Assume that there exist $S_1, S_2 \subseteq N$
with $S_1 \subset S_2$ such that $y^*(S_1) > y^*(S_2)$. Let $(x_j^1(\omega), p_j^1(\omega))_{j \in S_1}$ be the optimal
inventory allocation and pricing associated with the optimal ordering quantity
$y^*(S_1)$ for problem (11.8)-(11.9) with $S = S_1$. Similarly, let $(x_j^2(\omega), p_j^2(\omega))_{j \in S_2}$ be
the optimal inventory allocation and pricing associated with the optimal ordering
quantity $y^*(S_2)$ for problem (11.8)-(11.9) with $S = S_2$.

The definition of $y^*(S_1)$ and $y^*(S_2)$ implies that

\[
-c(y^*(S_1)) + g_{S_1}(p^1, x^1) > -c(y^*(S_2)) + g_{S_1}(p^3, x^3) \tag{11.10}
\]

for any $p_j^3(\cdot) \in P_j$ and $x_j^3(\cdot) \ge 0$ ($j \in S_1$) with

\[
y^*(S_2) = \sum_{j \in S_1} x_j^3(\omega), \quad \forall\, \omega \in \Omega, \tag{11.11}
\]

where for any $S \subseteq N$ and $(x_j(\omega), p_j(\omega))_{j \in S}$,

\[
g_S(p, x) = \sum_{j \in S} E[g_j(p_j(\omega), x_j(\omega), \omega)].
\]

Similarly,

\[
-c(y^*(S_2)) + g_{S_2}(p^2, x^2) \ge -c(y^*(S_1)) + g_{S_2}(p^4, x^4) \tag{11.12}
\]

for any $p_j^4(\cdot) \in P_j$ and all $x_j^4(\cdot) \ge 0$ ($j \in S_2$) with

\[
y^*(S_1) = \sum_{j \in S_2} x_j^4(\omega), \quad \forall\, \omega \in \Omega.
\]
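Specifically, let

\[
p_j^4(\omega) = p_j^2(\omega), \quad x_j^4(\omega) = x_j^2(\omega), \quad \forall\, j \in S_2 \setminus S_1,\ \omega \in \Omega.
\]

This, together with inequality (11.12), implies that

\[
-c(y^*(S_2)) + g_{S_1}(p^2, x^2) \ge -c(y^*(S_1)) + g_{S_1}(p^4, x^4) \tag{11.13}
\]

for any $p_j^4(\cdot) \in P_j$ and $x_j^4(\cdot) \ge 0$ ($j \in S_1$) with

\[
y^*(S_1) - y^*(S_2) = \sum_{j \in S_1} \big(x_j^4(\omega) - x_j^2(\omega)\big), \quad \forall\, \omega \in \Omega. \tag{11.14}
\]

Adding the two inequalities (11.10) and (11.13) together gives us that

\[
g_{S_1}(p^1, x^1) + g_{S_1}(p^2, x^2) > g_{S_1}(p^3, x^3) + g_{S_1}(p^4, x^4) \tag{11.15}
\]

for any $p_j^3(\cdot), p_j^4(\cdot) \in P_j$ ($j \in S_1$), $(x_j^3(\cdot))_{j \in S_1}$ satisfying (11.11), and $(x_j^4(\cdot))_{j \in S_1}$
satisfying (11.14).

Define

\[
\lambda(\omega) = \frac{y^*(S_1) - y^*(S_2)}{y^*(S_1) - \sum_{i \in S_1} x_i^2(\omega)}.
\]

Since

\[
\sum_{j \in S_1} x_j^2(\omega) \le \sum_{j \in S_2} x_j^2(\omega) = y^*(S_2) < y^*(S_1),
\]

we have that $\lambda(\omega) \in [0, 1]$. For $j \in S_1$, let

\[
\begin{aligned}
x_j^3(\omega) &= (1 - \lambda(\omega)) x_j^1(\omega) + \lambda(\omega) x_j^2(\omega), \\
p_j^3(\omega) &= (1 - \lambda(\omega)) p_j^1(\omega) + \lambda(\omega) p_j^2(\omega),
\end{aligned}
\]

and

\[
\begin{aligned}
x_j^4(\omega) &= \lambda(\omega) x_j^1(\omega) + (1 - \lambda(\omega)) x_j^2(\omega), \\
p_j^4(\omega) &= \lambda(\omega) p_j^1(\omega) + (1 - \lambda(\omega)) p_j^2(\omega).
\end{aligned}
\]

It is clear that $(x_j^3(\cdot))_{j \in S_1}$ satisfies (11.11) and $(x_j^4(\cdot))_{j \in S_1}$ satisfies (11.14). In
addition,

\[
x_j^3(\omega) + x_j^4(\omega) = x_j^1(\omega) + x_j^2(\omega)
\]

and

\[
p_j^3(\omega) + p_j^4(\omega) = p_j^1(\omega) + p_j^2(\omega).
\]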

Thus, the concavity of the realized revenue function $R_j$ implies that

\[
R_j(p_j^3(\omega), \omega) + R_j(p_j^4(\omega), \omega) \ge R_j(p_j^1(\omega), \omega) + R_j(p_j^2(\omega), \omega),
\]

and the convexity of $f_j$ implies that

\[
\begin{aligned}
& -f_j\big(x_j^3(\omega) - \beta_j(\omega) + \alpha_j(\omega) p_j^3(\omega)\big) - f_j\big(x_j^4(\omega) - \beta_j(\omega) + \alpha_j(\omega) p_j^4(\omega)\big) \\
& \quad \ge -f_j\big(x_j^1(\omega) - \beta_j(\omega) + \alpha_j(\omega) p_j^1(\omega)\big) - f_j\big(x_j^2(\omega) - \beta_j(\omega) + \alpha_j(\omega) p_j^2(\omega)\big).
\end{aligned}
\]

Adding the two inequalities together and taking expectation with respect to $\omega$ give
us an inequality that contradicts the inequality (11.15). Thus, $y^*(S_1) \le y^*(S_2)$. ∎

It is also important to point out that Lemma 11.2.5 is independent of how the
retailers’ demands are correlated. In addition, this result is true for any ordering
cost as long as the relevant quantities are well defined.

Lemma 11.2.6 There exists a scalar $c^*$ such that for any $S \subseteq N$, $V_{c^*}(S) \ge V(S)$
and $V_{c^*}(N) = V(N)$.



Proof. We consider two cases. First, assume that $y^*(N) = 0$. Lemma 11.2.5 implies
that $y^*(S) = 0$ for any nonempty set $S \subseteq N$. If we choose a sufficiently large $c^*$,
say $c^* > \max_j \max\{p_j, q_j\}$, it is easy to see that 0 is also an optimal solution
for the problem $\max_{y \ge 0} -c^* y + g(y, S)$. Thus, in this case, $V_{c^*}(S) = V(S)$ for any
$S \subseteq N$.

We now assume that $y^*(N) > 0$. For simplicity of notation, let $y^* = y^*(N)$.
Upon denoting $\hat{c} = c(y^*)/y^*$, we have that

$$\begin{aligned}
V(N) &= -c(y^*) + g(y^*, N) \\
&= -\hat{c} y^* + g(y^*, N) \\
&\le \max_{y \ge 0} -\hat{c} y + g(y, N) \\
&= V_{\hat{c}}(N).
\end{aligned}$$

Since $y^* > 0$, we have that

$$\begin{aligned}
V(N) &= \max_{y \ge 0} -c(y) + g(y, N) \\
&\ge g(0, N) \\
&= \lim_{c \to \infty} V_c(N).
\end{aligned}$$

The continuity of $V_c(N)$ as a function of $c$, together with the above two inequalities,
implies that there exists a $c^*$ such that $V(N) = V_{c^*}(N)$.

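Define $\bar{x} = \sup\{x > 0 : c(x)/x \ge c^*\}$. Let $\bar{y}^*$ be the smallest optimal solution
for the problem $\max_{y \ge 0} -c^* y + g(y, N)$. We claim that $\bar{y}^* \le \bar{x}$.

Assume to the contrary that $\bar{y}^* > \bar{x}$. The definition of $\bar{x}$ together with the
monotonicity of $c(x)/x$ implies that $c(\bar{y}^*)/\bar{y}^* < c^*$. Thus,

$$\begin{aligned}
V_{c^*}(N) &= -c^* \bar{y}^* + g(\bar{y}^*, N) \\
&< -c(\bar{y}^*) + g(\bar{y}^*, N) \\
&\le -c(y^*) + g(y^*, N) \\
&= V(N),
\end{aligned}$$

which contradicts the fact that $V(N) = V_{c^*}(N)$. Thus, $\bar{y}^* \le \bar{x}$.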
Define a new function $\bar{c}(x)$ as follows:

$$\bar{c}(x) = \begin{cases} c^* x, & \text{for } 0 \le x < \bar{x}, \\ c(x), & \text{otherwise.} \end{cases}$$

The following properties of $\bar{c}(x)$ will be useful for our analysis. First, $\bar{c}(x) \le c(x)$
for any $x$. This follows directly from the monotonicity of $c(x)/x$. Second, $\bar{c}(x)$
preserves the lower semicontinuity of $c(x)$. To show this, it suffices to prove that
$\bar{c}(x)$ is lower-semicontinuous at $x = \bar{x}$. Notice that

$$\lim_{y \to \bar{x}^+} \bar{c}(y) = \lim_{y \to \bar{x}^+} c(y) \ge c(\bar{x}) = \bar{c}(\bar{x}),$$

while

$$\lim_{y \to \bar{x}^-} \bar{c}(y) = c^* \bar{x} \ge \lim_{y \to \bar{x}^+} \bar{x} c(y)/y \ge c(\bar{x}) = \bar{c}(\bar{x}), \tag{11.16}$$

where the first inequality and the second inequality follow from the definition of $\bar{x}$
and the lower semicontinuity of $c(x)$, respectively. Notice that (11.16) implies that

$$\bar{c}(y) \le c^* y, \quad \text{for any } 0 \le y \le \bar{x}. \tag{11.17}$$

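Third, $\bar{y}^*$ is also optimal for the problem $\max_{y \ge 0} -\bar{c}(y) + g(y, N)$. Indeed, the
definition of $\bar{y}^*$ together with the fact $\bar{y}^* \le \bar{x}$ implies that for any $0 \le y < \bar{x}$,

$$-\bar{c}(\bar{y}^*) + g(\bar{y}^*, N) \ge -c^* \bar{y}^* + g(\bar{y}^*, N) \ge -c^* y + g(y, N) = -\bar{c}(y) + g(y, N),$$

where the first inequality follows from (11.17). For $y \ge \bar{x}$,

$$\begin{aligned}
-\bar{c}(\bar{y}^*) + g(\bar{y}^*, N) &\ge -c^* \bar{y}^* + g(\bar{y}^*, N) \\
&= -c(y^*) + g(y^*, N) \\
&\ge -c(y) + g(y, N) \\
&= -\bar{c}(y) + g(y, N),
\end{aligned}$$

where the first equality follows from the definition of $\bar{y}^*$ and $c^*$. Thus, $\bar{y}^*$ is also
optimal for the problem $\max_{y \ge 0} -\bar{c}(y) + g(y, N)$.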

We are now ready to prove that for any $S \subseteq N$, $V_{c^*}(S) \ge V(S)$. Let $\bar{y}^*(S)$ be
the smallest optimal solution for the problem $\max_{y \ge 0} -\bar{c}(y) + g(y, S)$. Notice that
$\bar{c}(x)$ is lower-semicontinuous. Hence, $\bar{y}^*(S)$ is well defined. Lemma 11.2.5 implies
that for any $S \subseteq N$, $\bar{y}^*(S) \le \bar{y}^* \le \bar{x}$.

We claim that

$$c^* \bar{y}^*(S) = \bar{c}(\bar{y}^*(S)). \tag{11.18}$$

Indeed, if $\bar{y}^*(S) < \bar{x}$, we have from the definition of $\bar{c}(\cdot)$ that $c^* \bar{y}^*(S) = \bar{c}(\bar{y}^*(S))$.
On the other hand, if $\bar{y}^*(S) = \bar{x}$ and $c^* \bar{y}^*(S) > \bar{c}(\bar{y}^*(S)) = c(\bar{y}^*(S))$, we have that
$\bar{y}^* = \bar{y}^*(S)$ and

$$V_{c^*}(N) = -c^* \bar{y}^* + g(\bar{y}^*, N) < -c(\bar{y}^*) + g(\bar{y}^*, N) \le V(N),$$

which is a contradiction. Thus, in this case, (11.18) follows from (11.17).
Finally, we have that

$$\begin{aligned}
V_{c^*}(S) &= \max_{y \ge 0} -c^* y + g(y, S) \\
&\ge -c^* \bar{y}^*(S) + g(\bar{y}^*(S), S) \\
&= -\bar{c}(\bar{y}^*(S)) + g(\bar{y}^*(S), S) \\
&= \max_{y \ge 0} -\bar{c}(y) + g(y, S) \\
&\ge \max_{y \ge 0} -c(y) + g(y, S) \\
&= V(S),
\end{aligned}$$

where the second equality follows from (11.18) and the last inequality from the
fact that $\bar{c}(y) \le c(y)$ for any $y \ge 0$. The proof is now complete. ∎

We can now prove that the core of $(N, V)$ is nonempty.



Theorem 11.2.7 Under Assumption 11.2.4, the inventory centralization game
$(N, V)$ with the characteristic value function defined by (11.8)-(11.9) has a
nonempty core. Let $(N, V_{c^*})$ be the inventory centralization game with marginal
ordering cost $c^*$, where $c^*$ is defined in Lemma 11.2.6. Then any element in the core of
$(N, V_{c^*})$ is also in the core of $(N, V)$.



Proof. The proof is straightforward. Let $l = (l_j)_{j \in N}$ be an element in the core of
$(N, V_{c^*})$. We have that for any $S \subseteq N$,

$$\sum_{j \in S} l_j \ge V_{c^*}(S) \ge V(S).$$

In addition,

$$\sum_{j \in N} l_j = V_{c^*}(N) = V(N).$$

Hence, $l = (l_j)_{j \in N}$ is also in the core of $(N, V)$. Since $(N, V_{c^*})$ is an inventory
centralization game with a linear ordering cost, Theorem 11.2.3 implies that it has
a nonempty core. Thus, $(N, V)$ has a nonempty core as well. ∎

Our approach to proving the nonemptiness of the core of a cooperative game
$(N, V)$ with quantity discount suggests a way to find an allocation in the core in
three steps. First, solve

$$V(N) = \max_{y \ge 0} -c(y) + g(y, N).$$

Second, given $V(N)$, find a $c^*$ such that

$$V(N) = \max_{y \ge 0} -c^* y + g(y, N).$$

Third, find an allocation in the core of the inventory centralization game $(N, V_{c^*})$
with a linear ordering cost by employing the duality approach in Sect. 11.2.2.
Theorem 11.2.7 implies that this allocation is in the core of $(N, V)$.
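
The first two steps can be sketched numerically. The example below is only a toy under stated assumptions: the function g, the quantity-discount cost c, the demand scenarios, and the integer grid are all hypothetical stand-ins (the actual g(y, N) is the value of the second-stage problem), and the third step, the duality-based allocation, is not implemented. It relies on the fact that the linear-cost value is nonincreasing and continuous in the scalar rate, so a bisection can locate a rate whose value matches V(N).

```python
# Toy illustration of steps 1 and 2. All data below are hypothetical.
DEMANDS = [(20, 0.3), (50, 0.4), (80, 0.3)]  # (demand, probability)
PRICE = 10.0                                  # unit revenue
GRID = range(0, 101)                          # candidate order quantities y


def g(y):
    """Toy concave stand-in for g(y, N): expected revenue from y units."""
    return sum(prob * PRICE * min(y, d) for d, prob in DEMANDS)


def c(y):
    """Toy quantity-discount cost: c(y)/y is nonincreasing in y."""
    return 0.0 if y == 0 else 100.0 + 4.0 * y


def value_with_cost(cost):
    """Maximum over the grid of -cost(y) + g(y)."""
    return max(-cost(y) + g(y) for y in GRID)


def find_linear_rate(target, lo=0.0, hi=PRICE, tol=1e-9):
    """Bisection on the scalar rate; the linear-cost value is nonincreasing in it."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if value_with_cost(lambda y: mid * y) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


v_n = value_with_cost(c)                          # step 1: V(N)
c_star = find_linear_rate(v_n)                    # step 2: rate with matching value
v_c_star = value_with_cost(lambda y: c_star * y)  # should reproduce V(N)
print(v_n, c_star, v_c_star)
```

In this toy instance the maximizer of the original problem is y = 50, and the bisection settles on a rate close to 6, for which the linear-cost problem attains the same value.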
11.3 Exercises 

Exercise 11.1. For the linear demand, the exponential demand, the 
Cobb-Douglas demand, the constant elasticity of substitution demand, and the 
Logit demand in Sect. 11.1, provide conditions under which the payoff functions 
satisfy the diagonally dominant conditions. 

Exercise 11.2. Show that the inventory centralization games are not convex in 
general. 

Exercise 11.3. The inventory centralization game analyzed in Sect. 11.2 assumes
that pricing decisions are made after the market signal is revealed. Do we
have similar results if pricing decisions are made before the realization of the
market signal?

Exercise 11.4. Consider a special case of the inventory centralization game with 
a single warehouse, the newsvendor game, in which pricing decisions are fixed, 
the ordering cost is linear, and the retailers have identical transportation costs Sj, 
inventory holding costs hj ~ , and emergency ordering costs hj . Simplify the dual 
derived in Sect. 11.2.2. Find an optimal solution of the dual in closed form and
an associated (dual-based) core allocation. Does the dual-based core allocation 
satisfy any of the monotonicity properties in Sect. 3.2? 



12 

Procurement Contracts 



12.1 Introduction 

The inventory models discussed in Chap. 9 focus on characterizing the optimal 
replenishment policy for a single facility given some assumptions, such as lead 
time and yield, of its supplier. This of course emphasizes the need, in many cases, 
to develop direct relationships with suppliers. These relationships can take many
forms, both formal and informal, but to ensure adequate supplies and timely
deliveries, buyers and suppliers typically agree on supply contracts. These
contracts address issues that arise between a buyer and a supplier, whether the buyer
is a manufacturer purchasing raw materials from a supplier or a retailer purchas- 
ing manufactured goods from a manufacturer. In a supply contract, the buyer and 
supplier may agree on 

• pricing and volume discounts, 

• minimum and maximum purchase quantities, 

• delivery lead times, 

• product or material quality, 

• product return policies. 

As we will see, supply contracts are very powerful tools that can be used for far 
more than ensuring adequate supply and demand for goods. 

D. Simchi-Levi et al., The Logic of Logistics: Theory, Algorithms, and Applications 229 
To illustrate the importance and impact of different types of supply contracts on 
supply chain performance, consider a typical two-stage supply chain consisting of 
a retailer and a supplier. In such a supply chain, the retailer places orders trying 
to maximize its own profit and the supplier reacts to the orders placed by the 
retailer. This process is referred to as a sequential supply chain since decisions are 
made sequentially. Thus, in a sequential supply chain, each party determines its 
own course of action independent of the impact of its decisions on other parties; 
clearly, this cannot be an effective strategy for supply chain partners. 

It is natural to look for mechanisms that enable supply chain entities to move 
beyond this sequential process and toward global optimization. Of course, this 
may be quite difficult since, in a typical supply chain, different parties may have 
different, sometimes even conflicting, objectives. Thus, it is important to identify 
mechanisms that maximize the efficiency of the supply chain while allowing differ- 
ent parties to focus on their own objectives. One way to achieve this goal is to use 
contracts specifying the transactions between supply chain parties such that every 
party’s objective is aligned with the objective of the entire supply chain. We will 
refer to such a contract as a contract that coordinates the supply chain. 

To illustrate how supply contracts can be used to coordinate the supply chain, we 
investigate in this chapter a simplified supply chain consisting of two risk-neutral 
decision makers, a supplier and a retailer. The retailer faces uncertain demands 
and needs to procure a certain quantity of a single product from the supplier. 
The supplier then produces and delivers the order to the retailer before demand 
is realized. The two parties negotiate and form a contract regarding the terms of 
the transactions. 

A simple example of such a contract is the wholesale contract that we have 
seen in the analysis of the newsvendor problem; see Chap. 9, Sect. 9.2, in which 
the supplier specifies a wholesale price, while the retailer places an order to the 
supplier and the payment is proportional to the quantity purchased by the retailer. 
Unfortunately, as we will see in the next section, this simple wholesale contract 
does not coordinate the supply chain in general. 

Several supply contracts have been proposed to achieve system efficiencies. 
Among those contracts, the buy-back contracts and the revenue-sharing contracts 
are commonly used in some industries due to their effectiveness and simplicity. 
In fact, under the setting to be specified later on in this chapter, the two contracts 
coordinate the supply chain; that is, these contracts allow supply chain partners 
to achieve global optimization, in other words, maximize supply chain expected 
profit. 

Furthermore, in these contracts, the retailer’s optimal strategy, namely, the
optimal ordering quantity, together with the supplier’s optimal strategy, namely, the
optimal cost parameters specified in the contracts, constitutes a Nash equilibrium.
Thus, neither the retailer nor the supplier could increase its profit by unilaterally 
deviating from its optimal strategies. 

Interestingly, the buy-back contracts and the revenue-sharing contracts are 
shown to be equivalent under our model setting. The literature on supply contracts 
that coordinate the supply chain system is quite extensive and is still expanding. 
We refer the reader to the review paper by Cachon (2003) for more details. 
Of course, effective supply contracts are important not only in the retail industry.
In the electronics industry, there has been a marked increase in purchasing volume
as a percentage of the firm’s total sales. For instance, between 1998 and 2000,
outsourcing in the electronics industry increased from 15% of all components to 40%.
This increase in the level of outsourcing implies that the procurement function 
becomes critical for an original equipment manufacturer (OEM) to remain in con- 
trol of its destiny. As a result, many OEMs focus on closely collaborating with the 
suppliers of their strategic components. In some cases, this is done using effective 
supply contracts that try to coordinate the supply chain. 

A different approach has been applied by OEMs for nonstrategic components. 
In this case, products can be purchased from a variety of suppliers, and flexibility 
to market conditions is perceived as more important than a permanent relationship 
with the suppliers. Indeed, commodity products, for instance, electricity, computer 
memory, steel, oil, grain, or cotton, are typically available from a large number of 
suppliers and can be purchased in spot markets. Because these are highly stan- 
dard products, switching from one supplier to another is not considered a major 
problem. 

Thus, in this chapter, we also introduce and analyze portfolio contracts based on 
the recent work of Martínez-de-Albéniz and Simchi-Levi (2005). In these contracts,
the buyer signs a portfolio of supply contracts, contracts that provide the buyer 
with the appropriate tradeoff between price and flexibility. 



12.2 Wholesale Price Contracts 

In a wholesale contract, the supplier specifies a wholesale price and in return, 
the retailer decides how much to order from the supplier. Specifically, when the 
retailer places an order, its payment to the supplier is proportional to the quantity 
it orders. Thus, in this case, the retailer is facing a newsvendor problem and chooses 
the optimal ordering quantity according to the newsvendor model we analyzed in 
Chap. 9, Sect. 9.2. Of course, the supplier anticipates the reaction of the retailer
and takes it into account when deciding its wholesale price. This is the so-called 
Stackelberg game between the supplier and the retailer, in which the supplier is 
the leader and the retailer is the follower. 

The setting in this model is as follows. The retailer places an order with the
supplier before the realization of the uncertain demand and sells the product to its 
customers at a unit price r. Let F be the cumulative distribution function of the 
demand. The function F is assumed to be strictly increasing and differentiable. 
For simplicity, we assume that unsatisfied demand is lost and there is no penalty 
cost for lost sales. In addition, leftover inventory is salvaged with unit price v. 
Finally, we assume that the supplier has no production capacity limit, and its unit 
production cost is c with v < c < r. 

Before proceeding to analyze the Stackelberg game between the supplier and 
the retailer, we first discuss the optimal production quantity of the entire system 
assuming that the supplier and the retailer belong to a centralized system. In this
case, the objective is to maximize the system’s expected profit. Given the
production quantity $q$, the expected profit for the total supply chain is

$$\pi^0(q) = -cq + r E_D[\min(q, D)] + v E_D[\max(q - D, 0)] = (r - c) q - (r - v) E_D[\max(q - D, 0)].$$

This is exactly the classical newsvendor problem analyzed in Chap. 9, Sect. 9.2.
Thus, the optimal production quantity for the total supply chain is

$$q^0 = F^{-1}\left(\frac{r - c}{r - v}\right),$$

where $F^{-1}$ is the inverse function of the cumulative distribution function $F$.
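
As a small numerical check of the critical-ratio formula, the sketch below assumes exponentially distributed demand, for which the inverse distribution function and the expected leftover inventory have closed forms. The parameter values and function names are hypothetical and chosen only for illustration.

```python
import math

R, C, V, MU = 10.0, 4.0, 1.0, 100.0  # hypothetical price, cost, salvage value, mean demand


def f_inv(p):
    """Inverse of the exponential distribution function F(q) = 1 - exp(-q / MU)."""
    return -MU * math.log(1.0 - p)


def expected_leftover(q):
    """E[max(q - D, 0)] for exponential demand, i.e. the integral of F over [0, q]."""
    return q - MU * (1.0 - math.exp(-q / MU))


def system_profit(q):
    """Expected profit of the centralized system for production quantity q."""
    return (R - C) * q - (R - V) * expected_leftover(q)


q0 = f_inv((R - C) / (R - V))  # critical ratio (r - c) / (r - v)
print(q0, system_profit(q0))
```

Evaluating `system_profit` slightly below and slightly above `q0` gives smaller values, as expected for the maximizer of a concave function.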

We now analyze the Stackelberg game between the supplier and the retailer.
Assume for now that the unit wholesale price of the supplier is $w$. As we already
noticed, the retailer is facing a newsvendor model. Again from the analysis of the
newsvendor model, we can determine the optimal ordering quantity for the retailer
as follows:

$$q(w) = F^{-1}\left(\frac{r - w}{r - v}\right). \tag{12.1}$$

Of course, here we assume $r > w > v$ to avoid trivial cases. Notice that $q(w) \ge q^0$
only if $w \le c$. However, this implies that the supplier makes a nonpositive profit.
Thus, the supplier prefers a higher wholesale price, and in this case, the retailer
always tends to order less than $q^0$, the quantity that is optimal for the entire
supply chain. We refer to this behavior as double marginalization. Of course,
this behavior has an intuitive explanation. Since the retailer bears all the risk for
overstocking, it has no incentive to order more and thus tries to reduce its risk
exposure by reducing inventory levels.

As we already pointed out, the supplier anticipates this behavior of the retailer
when setting its wholesale price. From (12.1), there is a one-to-one correspondence
between the optimal ordering quantity of the retailer and the wholesale price set
by the supplier, since $F$ is strictly increasing. Therefore, given the optimal ordering
quantity of the retailer $q$, the wholesale price is

$$w(q) = r - (r - v) F(q).$$

The objective of the supplier is to maximize its own profit, which can be written
as a function of the ordering quantity of the retailer:

$$\pi_s(q) = (w(q) - c) q = ((r - c) - (r - v) F(q)) q.$$

Of course, if the cumulative distribution function $F$ is too general, there is no
guarantee that the supplier has a unique optimal wholesale price. Hence, we focus
on demands with increasing generalized failure rate (IGFR) distributions, namely,
distributions such that $q F'(q)/(1 - F(q))$ is increasing. Notice that several
commonly used distributions, such as the normal distribution and the exponential
distribution, are IGFR distributions.
We now show that for IGFR demand distributions, the optimal wholesale price
of the supplier is unique. First, observe that the first-order optimality condition
implies that the retailer’s ordering quantity, associated with the supplier’s optimal
wholesale price, satisfies

$$\pi_s'(q) = -(c - v) + (r - v)(1 - F(q) - F'(q) q) = 0,$$

or

$$1 - \frac{q F'(q)}{1 - F(q)} = \frac{c - v}{r - v} \cdot \frac{1}{1 - F(q)}. \tag{12.2}$$

Notice that the left-hand side of the above equation is decreasing in $q$ while the
right-hand side of the equation is increasing in $q$. Hence, there is a unique solution
$q^*$ for Equation (12.2), and therefore $w(q^*)$ is the unique optimal wholesale price
of the supplier. Furthermore, it is easy to verify that $\pi_s'(q) > 0$ for $q < q^*$ and
$\pi_s'(q) < 0$ for $q > q^*$. In other words, $\pi_s(q)$ is increasing for $q < q^*$ and decreasing
for $q > q^*$. Thus, $\pi_s(q)$ is unimodal.
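
To make the uniqueness argument concrete, the following sketch solves Equation (12.2) by bisection for exponentially distributed demand, which is IGFR because its generalized failure rate equals q divided by the mean. All numbers and function names are hypothetical, and the bisection is a simple stand-in for whatever root-finding method one prefers. The sketch also compares the resulting order quantity with the centralized optimum.

```python
import math

R, C, V, MU = 10.0, 4.0, 1.0, 100.0  # hypothetical price, cost, salvage value, mean demand


def cdf(q):
    """Exponential distribution function with mean MU."""
    return 1.0 - math.exp(-q / MU)


def wholesale_price(q):
    """w(q) = r - (r - v) F(q): the price at which the retailer orders q."""
    return R - (R - V) * cdf(q)


def supplier_profit(q):
    return (wholesale_price(q) - C) * q


def first_order_gap(q):
    """Left side minus right side of (12.2); decreasing in q for IGFR demand."""
    gfr = q / MU  # q F'(q) / (1 - F(q)) for the exponential distribution
    return (1.0 - gfr) - (C - V) / (R - V) / (1.0 - cdf(q))


def solve_q_star(lo=0.0, hi=MU, tol=1e-10):
    """Bisection: the gap is positive at 0 and negative at MU."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if first_order_gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


q_star = solve_q_star()
w_star = wholesale_price(q_star)
q_system = -MU * math.log(1.0 - (R - C) / (R - V))  # centralized optimum
print(q_star, w_star, q_system)
```

With these data the supplier's preferred wholesale price lies strictly between the production cost and the retail price, and the induced order quantity is well below the centralized quantity, which is the double marginalization effect described above.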

In summary, for the wholesale contract, there exists a unique Nash equilibrium 
for the Stackelberg game between the supplier and the retailer when the demand 
distribution is IGFR. In addition, in such a contract, the retailer always orders 
less than the quantity that would be optimal for the entire supply chain due to 
the fact that it bears all the risks of overstocking. Thus, the wholesale contract 
does not coordinate the supply chain. 



12.3 Buy-Back Contracts 

The previous discussion reveals that wholesale contracts do not coordinate the 
supply chain, since the retailer bears all the risks of overstocking and tends to 
order less than the amount that would be optimal for the entire system. Thus, 
one might expect that the retailer is willing to order more and hence improve the 
performance of the supply chain if the supplier would share some of its risks. 

Buy-back contracts provide such a mechanism for the supplier to share the risks
with the retailer. In such a contract, the supplier specifies a wholesale price $w_b$ and
a buy-back price $b$. This contract is similar to the wholesale price contract; that is,
the retailer orders from the supplier according to a wholesale price $w_b$. However,
one significant difference is that in addition to a unit salvage value $v$ for unsold
items, the retailer can get a refund from the supplier at a unit price $b$.

Given a wholesale price $w_b$, a buy-back price $b$, and an order quantity $q$, the
retailer’s expected profit is

$$\pi_r^b(w_b, b, q) = -w_b q + r E_D[\min(q, D)] + (b + v) E_D[\max(q - D, 0)] = (r - w_b) q - (r - b - v) E_D[\max(q - D, 0)].$$

Consider now a wholesale price $w_b$ and a buy-back price $b$ satisfying the following
requirements:

$$r - w_b = \lambda (r - c) \quad \text{and} \quad r - b - v = \lambda (r - v)$$

for some $\lambda \in [0, 1]$, or alternatively,

$$w_b = r - \lambda (r - c) \quad \text{and} \quad b = (1 - \lambda)(r - v).$$

This implies that the expected profit of the retailer is given by

$$\pi_r^b(w_b, b, q) = \lambda \pi^0(q).$$

Hence, the optimal order quantity of the retailer equals $q^0$, the optimal production
quantity of the entire supply chain. Similarly, the expected profit of the supplier
is given by

$$\pi_s^b(w_b, b, q) = (1 - \lambda) \pi^0(q).$$

Thus, the supplier’s optimal production quantity is also equal to $q^0$. Therefore,
the system’s expected profit is maximized and the buy-back contract coordinates
the supply chain. Furthermore, in this case, the retailer receives a fraction $\lambda$ of the
system’s expected profit and the supplier receives the remaining fraction $(1 - \lambda)$.
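
The profit split can be checked numerically. The sketch below uses a small discrete demand distribution and hypothetical prices, so it illustrates the identities above rather than modeling any particular supply chain. For a given share, it builds the buy-back terms, evaluates the retailer's and the supplier's expected profits, and compares them with the corresponding fractions of the system's expected profit.

```python
DEMANDS = [(50, 0.25), (100, 0.5), (150, 0.25)]  # hypothetical (demand, probability)
R, C, V = 10.0, 4.0, 1.0                         # hypothetical price, cost, salvage value


def expected(fn):
    """Expectation of fn(D) over the discrete demand distribution."""
    return sum(prob * fn(d) for d, prob in DEMANDS)


def system_profit(q):
    return (-C * q + R * expected(lambda d: min(q, d))
            + V * expected(lambda d: max(q - d, 0)))


def buy_back_terms(lam):
    """Wholesale price w_b and buy-back price b giving the retailer a share lam."""
    return R - lam * (R - C), (1.0 - lam) * (R - V)


def retailer_profit(w_b, b, q):
    return (-w_b * q + R * expected(lambda d: min(q, d))
            + (b + V) * expected(lambda d: max(q - d, 0)))


def supplier_profit(w_b, b, q):
    """Wholesale margin minus refunds paid on unsold units."""
    return (w_b - C) * q - b * expected(lambda d: max(q - d, 0))


share = 0.4
w_b, b = buy_back_terms(share)
best_for_system = max(range(201), key=system_profit)
best_for_retailer = max(range(201), key=lambda q: retailer_profit(w_b, b, q))
print(w_b, b, best_for_system, best_for_retailer)
```

Because the retailer's expected profit is a fixed fraction of the system's expected profit for every order quantity, the two maximizers coincide.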



12.4 Revenue-Sharing Contracts 

A different contract that allows for risk sharing between suppliers and retailers is
the so-called revenue-sharing contract. In a revenue-sharing contract, the retailer
and the supplier agree on the wholesale price, typically a discounted wholesale
price, and in return the supplier receives a given fraction of the revenue from each
unit sold by the retailer. Of course, since the supplier receives some of the revenue,
it has an incentive to reduce the wholesale price and hence increase the amount
ordered by the retailer.

Assume that the wholesale price is $w_r$ and the supplier receives a fraction $(1 - \phi)$
of the retailer’s revenue. Thus, the retailer’s expected profit is

$$\pi_r^r(w_r, \phi, q) = -w_r q + \phi \left( r E_D[\min(q, D)] + v E_D[\max(q - D, 0)] \right) = (\phi r - w_r) q - \phi (r - v) E_D[\max(q - D, 0)].$$

If we choose $\phi$ and $w_r$ such that

$$\phi r - w_r = \lambda (r - c) \quad \text{and} \quad \phi = \lambda$$

for some $\lambda \in [0, 1]$, then

$$\pi_r^r(w_r, \phi, q) = \lambda \pi^0(q).$$

Similarly, the supplier’s expected profit is given by

$$\pi_s^r(w_r, \phi, q) = (1 - \lambda) \pi^0(q).$$

Thus, both the retailer’s optimal ordering quantity and the supplier’s optimal
production quantity equal $q^0$, the optimal production quantity for the entire supply
chain. Hence, the system’s expected profit is maximized, and the revenue-sharing
contract coordinates the supply chain.

Furthermore, if the wholesale price is $w_r(\lambda) = \lambda c$ and the retailer keeps a
fraction $\phi(\lambda) = \lambda$ of its revenue, passing the remaining fraction $1 - \lambda$ to the
supplier, the retailer receives a fraction $\lambda$ of the system’s expected profit and the
supplier receives the remaining fraction $(1 - \lambda)$.

Notice that both the buy-back contract with parameters $(w_b(\lambda), b(\lambda))$ and the
revenue-sharing contract with parameters $(w_r(\lambda), \phi(\lambda))$ coordinate the supply
chain and have the same allocation of the system’s expected profit to the
supplier and the retailer.

In fact, a revenue-sharing contract with parameters $(w_r(\lambda), \phi(\lambda))$ is equivalent
to the following contract: The wholesale price is $w_r(\lambda) + (1 - \lambda) r$, and the retailer
receives $r$ for each sold unit and gets a refund equal to $(1 - \lambda) r - (1 - \phi(\lambda)) v$
from the supplier for each salvaged unit. It is easy to verify that this is exactly the
buy-back contract with parameters $(w_b(\lambda), b(\lambda))$.
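
The equivalence can be verified realization by realization. The sketch below is only an illustration with hypothetical prices and function names. It builds both sets of contract terms for a common share and compares the retailer's realized profit under the two contracts for several order quantities and demand outcomes.

```python
R, C, V = 10.0, 4.0, 1.0  # hypothetical price, cost, salvage value


def revenue_sharing_terms(lam):
    """(w_r, phi): wholesale price and the fraction of revenue the retailer keeps."""
    return lam * C, lam


def buy_back_terms(lam):
    """(w_b, b): wholesale price and buy-back price for the same share lam."""
    return R - lam * (R - C), (1.0 - lam) * (R - V)


def retailer_profit_revenue_sharing(lam, q, demand):
    w_r, phi = revenue_sharing_terms(lam)
    sold, left = min(q, demand), max(q - demand, 0)
    return -w_r * q + phi * (R * sold + V * left)


def retailer_profit_buy_back(lam, q, demand):
    w_b, b = buy_back_terms(lam)
    sold, left = min(q, demand), max(q - demand, 0)
    return -w_b * q + R * sold + (b + V) * left


for lam in (0.2, 0.5, 0.9):
    for q in (0, 60, 120):
        for demand in (30, 60, 200):
            gap = (retailer_profit_revenue_sharing(lam, q, demand)
                   - retailer_profit_buy_back(lam, q, demand))
            print(lam, q, demand, round(gap, 9))
```

The printed gaps vanish up to rounding, so the two contracts give the retailer the same profit for every demand outcome and not only in expectation.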

The following example illustrates the impact of supply contracts in practice. 

Until 1998, video rental stores used to purchase copies of newly released movies 
from the movie studios for about $65 and rent them to customers for $3. Because 
of the high purchase price, rental stores did not buy enough copies to cover peak 
demand, which typically occurs during the first 10 weeks after a movie is released 
on video. The result was a low customer service level; in a 1998 survey, about 20% 
of customers could not get their first choice of movie. Then, in 1998, Blockbuster 
Video entered into a revenue-sharing contract with the movie studios in which
the wholesale price was reduced from $65 to $8 per copy, and, in return, studios
were paid about 30-45% of the rental price of every rental. This revenue-sharing
contract had a huge impact on Blockbuster revenue and market share. Today, 
revenue sharing is used by most large video rental stores; see Cachon and Lariviere 
(2005). 



12.5 Portfolio Contracts 

A recent trend for many industrial manufacturers has been outsourcing; firms are 
considering outsourcing everything from production and manufacturing to the pro- 
curement function itself. Indeed, in the mid-1990s, there was a significant increase 
in purchasing volume as a percentage of the firm’s total sales. Between 1998 and 
2000, outsourcing in the electronics industry increased from 15% of all components 
to 40%. 

Of course, the increase in the level of outsourcing implies that the procure- 
ment function becomes critical for a manufacturer to remain in control of its 
destiny. Thus, an effective procurement strategy has to focus on both driving costs 
down and reducing risks. These risks include both inventory and financial risks. 
By inventory risks, we refer to inventory shortages, while financial risks refer to 
the purchasing price, which is uncertain if the procurement strategy depends on 
spot markets. 

A traditional procurement strategy that eliminates financial risk is the use of 
fixed commitment contracts. These contracts specify a fixed amount of supply to 
be delivered at some point in the future; the supplier and the manufacturer agree 
on both the price and the quantity delivered to the manufacturer. Thus, in this 
case, the manufacturer bears no financial risk while taking huge inventory risks 
due to uncertainty in demand and the inability to adjust order quantities. 

One way to reduce inventory risk is through option contracts, in which the buyer 
prepays a relatively small fraction of the product price up-front, in return for a 
commitment from the supplier to reserve capacity up to a certain level. The initial 
payment is typically referred to as a reservation price or premium. If the buyer does 
not exercise the option, the initial payment is lost. The buyer can purchase any 
amount of supply up to the option level, by paying an additional price, agreed to 
at the time the contract is signed, for each unit purchased. This additional price 
is referred to as the execution price or exercise price. Of course, the total price 
(reservation plus execution price) paid by the manufacturer for each purchased 
unit is typically higher than the unit price in a fixed commitment contract. 

Evidently, option contracts provide the manufacturer with the flexibility to 
adjust order quantities depending on realized demand, and hence these contracts 
reduce inventory risk. Thus, these contracts shift risks from the manufacturer to 
the supplier since the supplier is now exposed to customer demand uncertainty. 
This is in contrast to fixed commitment contracts in which the manufacturer takes 
all the risk. 

Thus, consider a single-period model in which the manufacturer can procure 
a single product from multiple sources. For example, consider automotive man- 
ufacturing companies purchasing steel or PC manufacturers procuring memory 
units. 

The manufacturer faces stochastic demand $D$ and sells the finished product at
a unit selling price $r$. Unsold items have a unit salvage value $v$. Most importantly,
we assume that there are a total of $n$ suppliers and, before the planning horizon,
the retailer signs an option contract with each supplier. That is, the manufacturer
reserves capacity $x_i$ with the $i$th supplier for a reservation cost $v_i$ per unit of
capacity reserved and pays an execution fee of $w_i$ for each unit ordered from the
supplier, after demand is realized. Thus, the procurement strategy of the retailer
is a portfolio contract consisting of $n$ option contracts with parameters $(v_i, w_i, x_i)$.

The class of portfolio contracts contains several widely used contracts. This 
includes, for instance, long-term contracts, buy-back contracts, and flexibility con- 
tracts. A long-term contract specifies a fixed amount of supply, x, to be delivered 
at a predetermined time in the future for a given price, v. Thus, it is equivalent to a 
portfolio contract consisting of only one option contract with parameters (v, 0, x), 
that is, with positive reservation price and zero execution cost. In the long-term 
contract, the buyer bears all the risks of overstocking or understocking due to 
uncertain demand and its inability to adjust order quantity. 
A flexibility contract specifies a fixed amount of supply, $x$, for a given price,
$v$. In addition, in this contract, the amount to be delivered and paid for can differ
from the specified quantity by no more than a given percentage, say $\alpha$,
determined upon signing the contract. That is, the order quantity is within the interval
$[(1 - \alpha) x, (1 + \alpha) x]$. The flexibility contract is equivalent to a portfolio contract
consisting of a long-term contract with parameters $(v, 0, (1 - \alpha) x)$ and an option
contract with parameters $(0, v, 2 \alpha x)$.

Given that the retailer has to procure $q$ units from the suppliers, it can choose
an appropriate combination of suppliers so that its cost is minimized. Let $R(q)$ be
the optimal cost for procuring $q$ units from the $n$ suppliers. We have

$$\begin{aligned}
R(q) = \sum_{i=1}^{n} v_i x_i + \min\ & \sum_{i=1}^{n} w_i q_i \\
\text{s.t.}\ & \sum_{i=1}^{n} q_i = q, \\
& 0 \le q_i \le x_i, \quad \text{for all } i = 1, 2, \dots, n.
\end{aligned} \tag{12.3}$$

It is easy to prove that R(q) is a convex piecewise linear function of q. 
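
One way to see the convex piecewise linear shape is to evaluate (12.3) directly. Since the inner problem only asks how to fill q units from capacities with different unit execution fees, using the cheapest execution fees first is optimal. The sketch below implements this greedy rule; the portfolio data and the function name are hypothetical.

```python
def procurement_cost(q, contracts):
    """R(q) from (12.3) for contracts given as (v_i, w_i, x_i) tuples."""
    total_capacity = sum(x for _, _, x in contracts)
    if q < 0 or q > total_capacity:
        raise ValueError("q must lie between 0 and the total reserved capacity")
    cost = sum(v * x for v, _, x in contracts)  # reservation fees are paid regardless of q
    remaining = q
    for _, w, x in sorted(contracts, key=lambda t: t[1]):  # cheapest execution fee first
        used = min(remaining, x)
        cost += w * used
        remaining -= used
    return cost


# Hypothetical portfolio: one long-term contract (zero execution fee) and two options.
PORTFOLIO = [(5.0, 0.0, 40.0), (2.0, 4.0, 30.0), (1.0, 6.0, 30.0)]

# Marginal cost of each additional unit; it should never decrease as q grows.
increments = [procurement_cost(q + 1, PORTFOLIO) - procurement_cost(q, PORTFOLIO)
              for q in range(100)]
print(procurement_cost(0, PORTFOLIO), procurement_cost(50, PORTFOLIO),
      procurement_cost(100, PORTFOLIO))
```

In this example the marginal cost is 0 for the first 40 units, 4 for the next 30, and 6 for the last 30, which is the nondecreasing slope pattern of a convex piecewise linear function.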

Given an initial inventory level $I$ and the order quantity $q$, the buyer’s expected
profit is

$$f(I, q) = G(I + q) - R(q),$$

where

$$G(q) = r E[\min(q, D)] + v E[\max(0, q - D)].$$

In the following, we characterize the optimal replenishment policy for the
retailer. First, we present a result that illustrates how the optimal order quantity
changes monotonically as a function of the initial inventory level when the retailer’s
ordering cost function is convex while its revenue function is concave.

Theorem 12.5.1 Assume that the ordering cost function $R$ is convex and the
revenue function $G$ is concave. Moreover, $f(I, q) \to -\infty$ as $q \to \infty$ for any $I$.
Then there exists a function $q^*(I)$ solving

$$\max_{q \ge 0} f(I, q) \tag{12.4}$$

such that $q^*(I)$ is nonincreasing and $I + q^*(I)$ is nondecreasing.

Proof. First, observe that $q$ is an optimal solution for the optimization
problem (12.4) if and only if $q' = -q$ is optimal for the following problem:

$$\max_{q' \le 0} g(I, q') := G(I - q') - R(-q'). \tag{12.5}$$

Let $q'(I) = \min\{q' \le 0 \mid q' \text{ solves (12.5)}\}$. Since $G$ is concave, Theorem 2.2.6
implies that $g(I, q')$ is supermodular. Therefore, from Theorem 2.2.8, we have that
$q'(I)$ is nondecreasing. Thus, $q^*(I) = -q'(I)$ solves (12.4) and is nonincreasing.

To prove the remaining part of the theorem, observe that $q$ is an optimal solution
for the optimization problem (12.4) if and only if $I' = I + q$ is optimal for the
following problem:

$$\max_{I' \ge I} g(I, I') := G(I') - R(I' - I). \tag{12.6}$$

Since $R$ is convex, Theorem 2.2.6 implies that $g(I, I')$ is supermodular.
Therefore, from Theorem 2.2.8 and the definition of $q^*(I)$, we have that $I'(I) =
I + q^*(I)$ is nondecreasing. ∎

The above result implies that the optimal ordering quantity is a nonincreasing
function of the initial inventory level, while the end period inventory level,
$I + q^*(I) - E[D]$, is a nondecreasing function of the initial inventory level.

Finally, we characterize the structure of the optimal ordering policy of the
retailer when a portfolio contract is employed by the retailer. As we already pointed
out, the order cost function $R(q)$ is convex piecewise linear. In fact, without loss
of generality, assume that $w_1 \le w_2 \le \dots \le w_n$. Define $z_0 = 0$ and $z_i = \sum_{j=1}^{i} x_j$
for $i = 1, 2, \dots, n$. Then

$$R(q) = \sum_{j=1}^{n} v_j x_j + \sum_{j=1}^{i-1} w_j x_j + w_i (q - z_{i-1}), \quad \text{for } q \in [z_{i-1}, z_i].$$

Hence, for $q \in (z_{i-1}, z_i)$, $R'(q) = w_i$, and for $q = z_i$, $\partial R(q) = [w_i, w_{i+1}]$.

Theorem 12.5.2 If the ordering cost function $R(q)$ is given by (12.3), then there
exist inventory levels $f_i$ ($i = 1, 2, \dots, 2n + 1$) with

$$\infty = f_0 \ge f_1 \ge \dots \ge f_{2n} \ge f_{2n+1} = 0$$

such that

(a) For $I \in [f_{2i}, f_{2i-1})$, it is optimal to set $I' = I + q$ to a constant level such
that $w_i \in \partial G(I + q)$.

(b) For $I \in [f_{2i+1}, f_{2i})$, it is optimal to set the ordering quantity $q$ to the constant
level $z_i$.



Proof. For $i = 1, 2, \ldots, n$, let

$$q_i = \max\{q^* \ge 0 \mid q^* \text{ maximizes } G(q) - w_i q \text{ subject to } q \ge 0\}.$$

Theorem 2.2.4 and Theorem 2.2.8 imply that $q_i \le q_{i-1}$ for $i = 2, 3, \ldots, n$.

Let $f_0 = \infty$, $f_{2n+1} = 0$,

$$f_{2i-1} = \max(q_i - z_{i-1}, 0), \quad \text{and} \quad f_{2i} = \max(q_i - z_i, 0), \quad i = 1, 2, \ldots, n.$$

Then $\infty = f_0 \ge f_1 \ge \cdots \ge f_{2n} \ge f_{2n+1} = 0$. We claim that $f_i$ $(i = 0, 1, \ldots, 2n + 1)$
satisfies parts (a) and (b).

First, notice that for $I \in [f_{2i}, f_{2i-1}) \ne \emptyset$, we have $q_i > 0$ and
$q = q_i - I \in (z_{i-1}, z_i]$. The first-order optimality condition implies that $w_i \in \partial G(q_i)$. Hence,
$q^*(I) = q_i - I$ is optimal for problem (12.4), since we have
$0 \in \partial (G(I + q) - R(q))|_{q = q^*(I)}$. Thus, part (a) is true.

On the other hand, for $I \in [f_{2i+1}, f_{2i})$ with $i \ge 1$, we claim that the optimal
ordering quantity is $q^*(I) = z_i$. In fact, observe that for $I = f_{2i}$, we have
$q^*(I) = q_i - I = z_i$, and for $I = f_{2i+1} > 0$, we have $q^*(I) = q_{i+1} - I = z_i$. Thus,
Theorem 12.5.1 implies that for $I \in [f_{2i+1}, f_{2i})$, $q^*(I) = z_i$ is optimal. Finally, for
$I \ge f_1$, it is clear that $q^*(I) = 0$ is optimal. Hence, part (b) holds. ∎

Figure 12.1 illustrates the structure of the optimal ordering policy identified in 
Theorem 12.5.2 for a case with n = 2. 
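
The threshold structure of Theorem 12.5.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `portfolio_policy` is hypothetical, and the levels $q_i$ are passed in as given numbers (in the text they come from maximizing $G(q) - w_i q$, which requires a specific $G$). The thresholds are $f_{2i-1} = \max(q_i - z_{i-1}, 0)$ and $f_{2i} = \max(q_i - z_i, 0)$ as in the proof.

```python
def portfolio_policy(I, q_levels, x):
    """Order quantity q*(I) with the threshold structure of Theorem 12.5.2.

    q_levels[i-1] plays the role of q_i (assumed nonincreasing in i) and
    x[i-1] the capacity x_i of contract i; both are illustrative inputs.
    """
    n = len(x)
    z = [0.0]
    for xi in x:                                   # z_i = x_1 + ... + x_i
        z.append(z[-1] + xi)
    for i in range(1, n + 1):
        f_odd = max(q_levels[i - 1] - z[i - 1], 0.0)    # f_{2i-1}
        f_even = max(q_levels[i - 1] - z[i], 0.0)       # f_{2i}
        if I >= f_odd:      # I in [f_{2i-1}, f_{2i-2}): order the constant z_{i-1}
            return z[i - 1]
        if I >= f_even:     # I in [f_{2i}, f_{2i-1}): order up to q_i
            return q_levels[i - 1] - I
    return z[n]             # I in [f_{2n+1}, f_{2n}): order the constant z_n


# Example with n = 2, q_1 = 10, q_2 = 6, x = (3, 4), so z = (0, 3, 7).
for inventory in (12, 8, 5, 1):
    print(inventory, portfolio_policy(inventory, [10, 6], [3, 4]))
```

In this example the order quantity falls and the order-up-to level $I + q^*(I)$ rises as the initial inventory increases, in line with Theorem 12.5.1.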



FIGURE 12.1. Illustration of the structure of the optimal ordering policy (curves shown: order-up-to level and order quantity)



12.6 Exercises 



Exercise 12.1. Prove that the normal distribution and the exponential distribution
have increasing generalized failure rate (IGFR).

Exercise 12.2. As we have shown in Sect. 12.2, wholesale contracts do not
coordinate the supply chain in general. Now assume that the supplier is willing to
provide an all-unit quantity discount. Design an all-unit quantity discount
contract coordinating the supply chain; that is, find a per-unit wholesale price w(q)
as a decreasing function of the order quantity q such that the optimal ordering
quantity of the retailer and the optimal production quantity of the supplier equal
the optimal production quantity of the whole system.

Exercise 12.3. Show that a buy-back contract is a special case of portfolio
contracts.

Exercise 12.4. Show that Theorem 12.5.2 implies the optimality of a modified
base stock policy. In such a policy, there exist inventory target levels $b_i > 0$
$(i = 1, \ldots, n)$ with $b_i < \max(0, b_{i+1} - x_{i+1})$ for $i = 1, \ldots, n - 1$, such that it is optimal
to order nothing if $I > b_i$, order $b_i - I$ if $I \in [\max(0, b_i - x_i), b_i]$, or order $x_i$
otherwise.

Exercise 12.5. Consider a single manufacturer and a single supplier. Six months 
before demand is realized, the manufacturer has to sign a supply contract with 
the supplier. Let D be a random variable representing demand and f(D) be the
demand density function. Let p be the selling price, that is, the price at which the 
manufacturer sells products to consumers. 

The sequence of events is as follows. Procurement contracts are signed in Febru- 
ary and demand is realized during a short period of 10 weeks that starts in August. 
Components are delivered from the supplier to the manufacturer at the beginning 
of August, and the manufacturer produces items to customer orders. Thus, we can 
ignore any inventory holding cost. We will assume that unsold items at the end of 
the 10-week selling period have zero value. Finally, assume that the manufacturer
can also purchase additional items in the spot market. Let s be a random variable
representing the per-unit spot market price and f(s) be its density function. The
objective is to identify a procurement strategy so as to maximize expected profit. 

Assume the supplier offers an option contract in which the per-unit reservation 
price is v and the per-unit execution price is w. Given the existence of the spot 
market, how much capacity should the manufacturer reserve with the supplier 
when the contract is signed in February? 



13 

Process Flexibility 



13.1 Introduction 

For many manufacturing firms, the ability to match demand and supply is key 
to their success. Failure to do so could lead to loss of revenue, reduced service 
levels, negative impact on reputation, and decline in the company’s market share. 
Unfortunately, recent developments, such as intense market competition, prod- 
uct proliferation, and the increase in the number of products with a short life 
cycle, have created an environment where customer demand is volatile and unpre- 
dictable. In such an environment, traditional operations strategies such as building 
inventory, investing in capacity buffers, or increasing committed response time to 
consumers do not offer manufacturers a competitive advantage. Therefore, many 
manufacturers have started to adopt an operations strategy known as process flex- 
ibility to better respond to market changes without significantly increasing cost, 
inventory, or response time (see Simchi-Levi 2010). 

Process flexibility is defined as the ability to “build different types of products 
in the same manufacturing plant or on the same production line at the same 
time” (Jordan and Graves 1995). For example, in “full” (process) flexibility, each 
plant is capable of producing all products. In this case, when the demand for one 
product is higher than expected while the demand for a different product is lower 
than expected, a flexible manufacturing system can quickly make adjustments by 
shifting production capacities appropriately. By contrast, in a “dedicated” strategy 
(sometimes called “no flexibility”), each plant is responsible for a single product 
and hence does not have the same ability to match supply with demand. 
Because of its effectiveness in responding to uncertainties, process flexibility has
gained significant attention in several industries, in particular, in the automotive 
industry. Evidently, it is often too expensive to achieve a high degree of flexi- 
bility, for example, full flexibility, and as a result, sparse or partial flexibility is 
implemented instead. One set of sparse flexibility designs is the 2-flexibility des- 
igns. A flexibility design is a 2-flexibility design if each plant produces exactly two 
products and demand for each product can be satisfied from exactly two plants. 

Of course, there are many ways to implement sparse designs, and the challenge is 
to identify an effective one. An important concept analyzed in the literature and 
applied in practice by various companies is the concept of the long chain. The 
first to observe the power of the long chain were Jordan and Graves (1995), who, 
through empirical analysis, showed that the long-chain design can provide almost 
as much benefit as full flexibility. In particular, Jordan and Graves (1995) found 
that for randomly generated demand, the expected amount of demand that can 
be satisfied by a long-chain design is very close to that of a full flexibility design. 

Though the analysis and results can be extended to more general settings, for
simplicity, we focus on balanced manufacturing systems, that is, manufacturing
systems with an equal number of plants and products in which every plant has the
same capacity. Given a balanced manufacturing system, a flexibility design $\mathcal{A}$ is
represented by the arc set of a directed bipartite graph, where an arc from plant node
$i$ to product node $j$ implies that plant $i$ is capable of producing product $j$. For
example, if $\mathcal{A}$ is a dedicated design, then $\mathcal{A}$ has exactly $n$ arcs such that each
plant node is incident to one arc and each product node is incident to one arc.
By contrast, if $\mathcal{A}$ is a full flexibility design, then $\mathcal{A}$ has arcs connecting every
plant node to all product nodes.

Because $\mathcal{A}$ is represented by a bipartite graph, applying standard graph theory
notation, we define an undirected cycle in $\mathcal{A}$ to be a set of arcs that forms a
cycle when the arc directions are ignored. A flexibility design $\mathcal{A}$ is a long chain
if its arcs form exactly one undirected cycle containing all plant and product
nodes (see Fig. 13.1 for an example). A closed chain is defined as an induced
subgraph in $\mathcal{A}$ that forms an undirected cycle, while an open chain is an induced
subgraph in $\mathcal{A}$ that forms an undirected line (one arc less than an undirected
cycle). Figure 13.1 presents an example of an open and a closed chain. It can be
seen that any 2-flexibility design, where each product/plant node is incident to
two arcs, is the union of a number of closed chains.

The results presented in this chapter are motivated by a few observations made
in the literature regarding the effectiveness of the long-chain flexibility design. The
first is an observation made by Graves (2008) and Hopp et al. (2004)
regarding the performance of the long chain for a balanced system when product
demands are independent and identically distributed (iid). The observation states
that if one starts with a dedicated design and adds arcs to create the long chain,
the incremental benefit, that is, the change in performance associated with each
added arc, is increasing.

FIGURE 13.1. Configurations for flexibility designs

To illustrate this observation, consider an example with six plants and six
products, where the demand for each product is equal to 0.8, 1, or 1.2 with equal
probabilities, and the capacity of each plant is 1. We start with a dedicated
flexibility design (the dashed arcs in Fig. 13.2a) and add arcs (1,2), (2,3), ...,
(5,6), and (6,1) one at a time, until we complete the long chain. Each time we
add such an arc, we determine the expected sales associated with the resulting
design at that time. Figure 13.2b displays the performance of the flexibility designs
at different stages, as well as the incremental benefits when a new arc is added.
The incremental benefits increase as we add more arcs. The biggest impact,
surprisingly, occurs when we add the last arc and close the long chain.
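
The six-plant example can be reproduced by brute force. The sketch below is not the authors' code; it assumes iid demands and enumerates all $3^6$ equally likely demand vectors. Sales for a given demand vector are evaluated with the max-flow/min-cut identity for this bipartite network with unit plant capacities: the minimum, over subsets $T$ of products, of the number of plants adjacent to $T$ plus the demand outside $T$. Demands and capacities are measured in units of 0.2 so that the arithmetic stays exact; the names `CAP`, `outside` and `expected_sales` are illustrative.

```python
from fractions import Fraction
from itertools import product

n = 6
CAP = 5                                            # plant capacity 1 = 5 units of 0.2
scenarios = list(product([4, 5, 6], repeat=n))     # demands 0.8, 1, 1.2 in units of 0.2
masks = range(1 << n)                              # bitmask encodes a product subset T

# outside[s][m]: total demand of the products NOT in subset m, for scenario s
outside = [[sum(d[j] for j in range(n) if not (m >> j) & 1) for m in masks]
           for d in scenarios]

def expected_sales(arcs):
    """Expected sales of a design given as (plant, product) pairs, 0-indexed."""
    # nbrs[m]: number of plants that can produce at least one product in subset m
    nbrs = [len({i for (i, j) in arcs if (m >> j) & 1}) for m in masks]
    total = sum(min(CAP * nbrs[m] + row[m] for m in masks) for row in outside)
    return Fraction(total, CAP * len(scenarios))

dedicated = [(i, i) for i in range(n)]
flexible = [(i, (i + 1) % n) for i in range(n)]    # arcs (1,2), ..., (5,6), (6,1)
stages = [expected_sales(dedicated + flexible[:k]) for k in range(n + 1)]
increments = [b - a for a, b in zip(stages, stages[1:])]
full = expected_sales([(i, j) for i in range(n) for j in range(n)])

for k, inc in enumerate(increments, start=1):
    print(f"flexible arc {k}: incremental benefit {float(inc):.4f}")
print("long chain:", float(stages[-1]), " full flexibility:", float(full))
```

The subset enumeration is exponential in the number of products, so it is only practical for small systems such as this one.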

This example also illustrates the second observation. The long chain is an
effective flexibility design; in this case, it achieves the same performance as that
of full flexibility. Indeed, numerous empirical papers reported that the long-chain
flexibility design is more effective than other 2-flexibility designs, and its expected
sales are almost the same as those of full flexibility.

FIGURE 13.2. The increase in incremental benefit

To explain the effectiveness of the long chain, we start in Sect. 13.2 by establishing
a supermodularity property and apply it to prove that the incremental benefits
are increasing as the long chain is constructed. Section 13.3 illustrates that the
performance of a long chain can be characterized by the difference in the
performances of two open chains. This result allows us to show that the long-chain design
maximizes expected sales among all 2-flexibility design strategies. In Sect. 13.4,
we then apply the previous results to compare the performance of the long chain
to that of full flexibility. Finally, we present some extensions beyond long chains
and balanced systems in Sect. 13.5.

13.2 Supermodularity and Incremental Benefits of Long 
Chains 

Consider a balanced manufacturing system of size $n$ facing random demand. In such
a system, there are $n$ plants, each of which has unit capacity, and $n$ different
products. We use the vector $D$ to denote the random demand and $d$ to denote
a particular realization of it. Since in practice demand is never negative,
$D$ is assumed to be nonnegative.

For a balanced system of size $n$, we say that its demand, $D$, is exchangeable if
$[D_1, \ldots, D_n]$ equals $[D_{\pi(1)}, \ldots, D_{\pi(n)}]$ in distribution for any $\pi$ that is a permutation
of $\{1, 2, \ldots, n\}$. We note that any independently and identically distributed (iid)
demand is exchangeable, but not all exchangeable demands are iid.

Next, we define several classes of flexibility designs for balanced manufacturing
systems. For any integer $n \ge 2$, the dedicated design for a balanced system of size
$n$, $\mathcal{D}_n$, is defined as $\mathcal{D}_n = \{(i, i) \mid i = 1, 2, \ldots, n\}$; a long-chain flexibility design
for a balanced system of size $n$, $\mathcal{C}_n$, is defined as
$\mathcal{C}_n = \mathcal{D}_n \cup \{(i, i + 1) \mid i = 1, 2, \ldots, n - 1\} \cup \{(n, 1)\}$;
and the full flexibility design for a balanced system of
size $n$, $\mathcal{F}_n$, is defined as $\mathcal{F}_n = \{(i, j) \mid i, j = 1, 2, \ldots, n\}$. In flexibility designs, we
refer to an arc $(i, i)$ as a dedicated arc and an arc $(i, j)$, $i \ne j$, as a flexible arc. We also
define the open chain $\mathcal{L}_k$ as $\mathcal{D}_k \cup \{(i, i + 1) \mid i = 1, \ldots, k - 1\}$, for any integer
$k > 0$. One can think of $\mathcal{L}_k$ as the open chain that connects plant 1 to product $k$.
Note that $\mathcal{L}_k$ is simply $\mathcal{C}_k \setminus \{(k, 1)\}$.
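
As a concrete illustration of these definitions, each design can be written as a set of (plant, product) pairs. The function names below are chosen for this sketch only and do not come from the text.

```python
def dedicated(n):
    """Dedicated design: arcs (i, i) for i = 1, ..., n."""
    return {(i, i) for i in range(1, n + 1)}

def long_chain(n):
    """Long chain: dedicated arcs plus (i, i+1) for i < n and the closing arc (n, 1)."""
    return dedicated(n) | {(i, i % n + 1) for i in range(1, n + 1)}

def full_flexibility(n):
    """Full flexibility: every plant can produce every product."""
    return {(i, j) for i in range(1, n + 1) for j in range(1, n + 1)}

def open_chain(k):
    """Open chain connecting plant 1 to product k."""
    return dedicated(k) | {(i, i + 1) for i in range(1, k)}

print(sorted(long_chain(3)))
print(open_chain(4) == long_chain(4) - {(4, 1)})   # the open chain is the long chain minus (k, 1)
```

For $n = 2$ the long chain and the full flexibility design coincide, which is why sparse designs only become interesting for larger systems.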

Given a random instance of the demand, $d$, the maximum sales that can be
achieved by a flexibility design $\mathcal{A}$ with arc capacities $u$, denoted by $P(d, \mathcal{A}, u)$, is
defined as

$$
\begin{aligned}
P(d, \mathcal{A}, u) = \max \ & \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij} \\
\text{s.t.} \ & \sum_{i=1}^{n} f_{ij} \le d_j, && \forall\, 1 \le j \le n, \\
& \sum_{j=1}^{n} f_{ij} \le 1, && \forall\, 1 \le i \le n, \\
& f_{ij} \le u_{ij}, && \forall\, 1 \le i, j \le n, \\
& f_{ij} \ge 0, && \forall\, 1 \le i, j \le n, \\
& f_{ij} = 0, && \forall\, (i, j) \notin \mathcal{A}.
\end{aligned}
$$

It is not difficult to see that this optimization problem is a max-flow problem,
and as a result, we refer to $f_{ij}$ as the flow on arc $(i, j)$. Note that when the arc
capacities are 1, the capacity constraints are redundant. In this case, the above
optimization problem is referred to as the optimization problem associated with
$P(d, \mathcal{A})$, or simply, $P(d, \mathcal{A})$, when there is no ambiguity.
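
A minimal sketch of how $P(d, \mathcal{A}, u)$ might be computed as a max-flow problem follows. It uses a plain augmenting-path method (Edmonds-Karp) on a network with a source feeding the plants and the products feeding a sink; this is one standard choice, not an algorithm prescribed by the text. The function name `max_sales`, the node numbering, and the convention that a missing `u` means unit arc capacities are assumptions of the sketch.

```python
from collections import deque

def max_sales(d, arcs, u=None):
    """P(d, A, u): maximum flow from the plants (capacity 1) to the products (capacity d_j).

    d    -- list of demands, d[j-1] for product j
    arcs -- iterable of (plant, product) pairs, 1-indexed
    u    -- optional dict of arc capacities; unit capacities if omitted
    """
    n = len(d)
    s, t = 0, 2 * n + 1                      # plants are 1..n, products are n+1..2n
    cap = {}

    def add(a, b, c):
        cap[(a, b)] = cap.get((a, b), 0) + c
        cap.setdefault((b, a), 0)            # residual (reverse) arc

    for i in range(1, n + 1):
        add(s, i, 1)                         # unit plant capacity
        add(n + i, t, d[i - 1])              # demand of product i
    for (i, j) in arcs:
        add(i, n + j, 1 if u is None else u[(i, j)])

    adj = {}
    for (a, b) in cap:
        adj.setdefault(a, []).append(b)

    total = 0
    while True:
        parent = {s: None}                   # breadth-first search for an augmenting path
        queue = deque([s])
        while queue and t not in parent:
            a = queue.popleft()
            for b in adj.get(a, []):
                if b not in parent and cap[(a, b)] > 1e-12:
                    parent[b] = a
                    queue.append(b)
        if t not in parent:
            return total
        path, b = [], t
        while parent[b] is not None:
            path.append((parent[b], b))
            b = parent[b]
        push = min(cap[e] for e in path)     # bottleneck residual capacity
        for (a, b) in path:
            cap[(a, b)] -= push
            cap[(b, a)] += push
        total += push


dedicated = {(1, 1), (2, 2), (3, 3)}
print(max_sales([0, 2, 1], dedicated))               # no flexibility
print(max_sales([0, 2, 1], dedicated | {(1, 2)}))    # plant 1 can also make product 2
```

With demand $d = (0, 2, 1)$, the single flexible arc $(1, 2)$ lets the idle plant 1 cover the excess demand for product 2.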

Under random demands $D$, we define the performance, also referred to as
expected sales, of $\mathcal{A}$ to be $E[P(D, \mathcal{A})]$, where $E[\cdot]$ is the expectation of a random
variable with respect to $D$. For succinctness, we also use $[\mathcal{A}]$ to denote this quantity
when the random vector $D$ is given.

13.2.1 Supermodularity in Arc Capacities 

In this subsection, we show that $P(d, \mathcal{A}, u)$ is supermodular in the capacities of
flexible arcs in $\mathcal{A}$. For this purpose, we first present a classical result by Gale and
Politof (1981) on supermodularity of the maximum-weight circulation problem. We
then note that $P(d, \mathcal{A}, u)$ can be computed by solving an equivalent
maximum-weight circulation problem, and the result by Gale and Politof (1981) applies.

In a maximum-weight circulation problem, we are given a directed graph $G$ and
for each arc $\gamma$, a weight $w_\gamma$, a lower bound $l_\gamma$, and an upper bound $u_\gamma$ on the arc
flow. A flow $f$ is called a circulation if it satisfies the flow-balance constraints; that
is, at each node, the inflow equals the outflow. The maximum-weight circulation
problem is to find a feasible circulation $f$ satisfying the lower and upper bound
constraints such that the total weight $\sum_\gamma w_\gamma f_\gamma$ is maximized. For a set of arcs
$S$ and any vector $\xi$ indexed by the arcs, let $\xi_S = (\xi_\gamma)_{\gamma \in S}$, a vector consisting of
components of $\xi$ with indices in $S$. The following theorem from Gale and Politof
(1981) illustrates that the optimal objective value of the problem, denoted as
$F(w, l, u)$, is supermodular in the capacities of arcs that are in series (pairwise).
Note that two arcs $\alpha, \beta$ are said to be in series if, for any (undirected) cycle $C$
containing both $\alpha$ and $\beta$, $\alpha$ and $\beta$ have the same direction.

Theorem 13.2.1 If $S$ contains only arcs that are in series (pairwise), $F(w, l, u)$
is supermodular in $u_S$.

Proof. In view of Theorem 2.2.2, we only need to show that for any two arcs $\alpha$ and
$\beta$ in $S$, $F(w, l, u)$ has increasing differences in $(u_\alpha, u_\beta)$ when other components
of $u$ are fixed. Given two capacity vectors $u$ and $u'$ with $u_{ij} = u'_{ij}$ for all $(i, j)$
except $\alpha$ and $\beta$, assume without loss of generality that $u_\alpha > u'_\alpha$ and $u_\beta < u'_\beta$. Let
$f$ and $f'$ be the optimal circulations in the graph given in the maximum-weight
circulation problem with capacities $u$ and $u'$, respectively. It suffices to construct
two circulations $g$ and $g'$ such that $g$ and $g'$ are feasible for the maximum-weight
circulation problems with capacities $u \wedge u'$ and $u \vee u'$, respectively, and

$$g + g' = f + f'.$$

If $f \le u \wedge u'$, or $f' \le u \wedge u'$, simply let $g = f$ and $g' = f'$, or $g = f'$ and $g' = f$.
Assume that $u'_\alpha < f_\alpha \le u_\alpha$ and $u_\beta < f'_\beta \le u'_\beta$. Let $\xi = f - f'$. Clearly, $\xi$ is a
circulation with $\xi_\alpha > 0$ and $\xi_\beta < 0$. We now show that $\xi$ can be decomposed as
the summation of several conformal circuits. That is, there exist simple cycles $C_l$
and scalars $t_l > 0$ for $l = 1, \ldots, r$ for some integer $r$ such that

$$\xi = \sum_{l=1}^{r} t_l f^l, \tag{13.1}$$

where $f^l_\gamma = \mathrm{sign}(\xi_\gamma)$ if $\gamma \in C_l$ (regardless of direction) and zero otherwise. To see
this, start from any node, say node $i_0$, incident to an arc $(i_0, i_1)$ with $\xi_{i_0 i_1} \ne 0$.
Without loss of generality, assume $\xi_{i_0 i_1} > 0$; otherwise, consider $-\xi$. Since $\xi$ is a
circulation, node $i_1$ must be incident to an arc $(i_1, i_2)$ or $(i_2, i_1)$ for some node
$i_2$ such that either $\xi_{i_1 i_2} > 0$ or $\xi_{i_2 i_1} < 0$. Similarly, node $i_2$ must be incident to
an arc $(i_2, i_3)$ or $(i_3, i_2)$ for some node $i_3$ such that either $\xi_{i_2 i_3} > 0$ or $\xi_{i_3 i_2} < 0$.
Continue the process until the first time we meet a node that has appeared before,
say node $i_\nu$. Let $i_\kappa$ be the node visited immediately before the second time we
visit $i_\nu$. In this case, we end up with a simple cycle $i_\nu i_{\nu+1} \cdots i_\kappa i_\nu$, denoted by $C_1$.
Let $C_1^+$ and $C_1^-$ be the sets of arcs on $C_1$ in the same direction as $C_1$ and in the
opposite direction as $C_1$, respectively. Let

$$t_1 = \min_{\gamma \in C_1} |\xi_\gamma|,$$

and $f^1_\gamma = 1$ for $\gamma \in C_1^+$ and $-1$ for $\gamma \in C_1^-$. Clearly, $f^1$ is a circulation with
$f^1_\gamma = \mathrm{sign}(\xi_\gamma)$ for $\gamma \in C_1$ and zero otherwise. Our construction implies that the
circulation $\xi - t_1 f^1$ has at least one more arc with zero flow than the circulation
$\xi$. By repeating the above construction, we can show that (13.1) holds.

Note that for any $l = 1, \ldots, r$, $f^l_\alpha$ takes values 0 or 1 because $f_\alpha > f'_\alpha$, and $f^l_\beta$
takes values 0 or $-1$ because $f_\beta < f'_\beta$. In addition, since $\alpha$ and $\beta$ are in series, it
is impossible to have $f^l_\alpha = 1$ and $f^l_\beta = -1$ simultaneously for any cycle $f^l$.

Define new circulations

$$g = f - \sum_{l = 1:r,\ f^l_\alpha = 1} t_l f^l, \qquad g' = f' + \sum_{l = 1:r,\ f^l_\alpha = 1} t_l f^l.$$

We have that $g + g' = f + f'$ and

$$g = f' + \sum_{l = 1:r,\ f^l_\alpha = 0} t_l f^l, \qquad g' = f - \sum_{l = 1:r,\ f^l_\alpha = 0} t_l f^l.$$

We claim that $g$ and $g'$ are feasible for the maximum-weight circulation problems
with capacities $u \wedge u'$ and $u \vee u'$, respectively. Notice that for any arc $\gamma$, since
$f^l_\gamma$ has the same sign as $f_\gamma - f'_\gamma$ if $\gamma \in C_l$, we have that

$$g_\gamma = f_\gamma - \sum_{l = 1:r,\ f^l_\alpha = 1} t_l f^l_\gamma \in [\min(f_\gamma, f'_\gamma), \max(f_\gamma, f'_\gamma)].$$

Similarly,

$$g'_\gamma \in [\min(f_\gamma, f'_\gamma), \max(f_\gamma, f'_\gamma)].$$

Thus, for any arc $\gamma$ different from $\alpha$ and $\beta$,

$$g_\gamma, g'_\gamma \in [l_\gamma, u_\gamma].$$

For the arc $\alpha$, since $f^l_\alpha = 0$ for $\alpha \notin C_l$,

$$g_\alpha = f'_\alpha \in [l_\alpha, \min(u_\alpha, u'_\alpha)], \qquad g'_\alpha = f_\alpha \in [l_\alpha, \max(u_\alpha, u'_\alpha)].$$

For the arc $\beta$, since $f^l_\alpha = 1$ implies that $f^l_\beta = 0$, we have that

$$g_\beta = f_\beta \in [l_\beta, \min(u_\beta, u'_\beta)], \qquad g'_\beta = f'_\beta \in [l_\beta, \max(u_\beta, u'_\beta)].$$

Thus, $g$ and $g'$ are feasible for the maximum-weight circulation problems with
capacities $u \wedge u'$ and $u \vee u'$, respectively. ∎

To apply the above theorem, we convert the optimization problem of computing
$P(d, \mathcal{A}, u)$ to an equivalent maximum-weight circulation problem. Specifically, let
$G(\mathcal{A})$ be the underlying graph, which contains $\mathcal{A}$, an additional node $s$, an arc
from $s$ to each of the plant nodes, and an arc from each of the product nodes to $s$.
Set the weight of each plant-to-product arc (that is, the arcs in $\mathcal{A}$) to 1 and the
weight of every other arc to zero. The upper bound (capacity) on the flow on an
arc from $s$ to plant $i$ is set to be 1 for all $i = 1, 2, \ldots, n$; the upper bound for the
flow on an arc connecting product $j$ to $s$ is set to be $d_j$ for all $j = 1, 2, \ldots, n$; and
the upper bound for the flow on every arc $(i, j) \in \mathcal{A}$ is set to be $u_{ij}$. Finally, we
set the lower bound for the flow on every arc in $G(\mathcal{A})$ to be 0. It is straightforward
to show that $P(d, \mathcal{A}, u)$ can be computed by identifying a circulation satisfying
the lower and upper bounds on flows with a maximum weight.

The underlying graph of the maximum-weight circulation problem, $G(\mathcal{A})$, is
illustrated in Fig. 13.3 for $\mathcal{A} = \mathcal{C}_5$, the long chain for a balanced system of size 5.
Recall that flexible arcs in a long chain of size $n$ are arcs from the set
$\{(i, i + 1) : i = 1, 2, \ldots, n - 1\} \cup \{(n, 1)\}$.

Theorem 13.2.2 Let $\mathcal{A}$ be a flexibility design for a balanced system of size $n$,
and $\mathcal{A} \subseteq \mathcal{C}_n$. Let $S$ be the set of all flexible arcs in $\mathcal{A}$. We have that $P(d, \mathcal{A}, u)$
is supermodular in $u_S$. Hence, for any subsets $X, Y$ of $S$,

$$P(d, \mathcal{A} \setminus (X \cap Y)) + P(d, \mathcal{A} \setminus (X \cup Y)) \ge P(d, \mathcal{A} \setminus X) + P(d, \mathcal{A} \setminus Y).$$

Proof. We first show that flexible arcs are in series (pairwise). Consider any two
flexible arcs $\alpha$ and $\beta$ and let $C$ be an arbitrary (undirected) simple cycle in $G(\mathcal{C}_n)$
containing both $\alpha$ and $\beta$. If $C$ does not contain node $s$, then $C$ must be the
undirected cycle that contains every plant-to-product arc in $\mathcal{C}_n$. In that case, it
is easy to verify that $\alpha$ and $\beta$ have the same direction in $C$. Otherwise, suppose $C$
contains $s$. In such a case, $C$ can be decomposed into four pieces, $X_1$, $X_2$, $\alpha$, and
$\beta$, where $X_1, X_2$ are the two paths between $\alpha$ and $\beta$. Without loss of generality, we
assume $X_1$ contains $s$. Since $\alpha$ and $\beta$ cannot be incident to the same node, both $X_1$
and $X_2$ are nonempty. As $X_2$ does not contain $s$, all arcs in $X_2$ are plant-to-product
arcs (i.e., $X_2 \subseteq \mathcal{C}_n$). Because of the structure of $\mathcal{C}_n$, $X_2$ contains an odd number
of arcs. Moreover, the path in $X_2 \cup \{\alpha\} \cup \{\beta\}$ has alternating directions for every
two consecutive arcs, and therefore, $\alpha$ and $\beta$ have the same direction in $C$. This is
illustrated by Fig. 13.4. Since this is true for any arbitrary undirected cycle $C$, $\alpha$
and $\beta$ are in series in $G(\mathcal{C}_n)$. From Theorem 13.2.1, $P(d, \mathcal{A}, u)$ is supermodular
in $u_S$.

FIGURE 13.3. $G(\mathcal{C}_n)$ for the max-weight circulation associated with $P(d, \mathcal{C}_n, u)$

FIGURE 13.4. Illustration for the proof of Theorem 13.2.2 (the path $X_2$ between $\alpha$ and $\beta$ contains $2k - 1$ arcs)

To complete the proof, define $u$ and $u'$ as follows:

$$u_\gamma = \begin{cases} 1, & \gamma \in \mathcal{A} \setminus X, \\ 0, & \gamma \in X, \end{cases} \qquad u'_\gamma = \begin{cases} 1, & \gamma \in \mathcal{A} \setminus Y, \\ 0, & \gamma \in Y. \end{cases}$$

Clearly,

$$P(d, \mathcal{A} \setminus X) = P(d, \mathcal{A}, u),$$

$$P(d, \mathcal{A} \setminus Y) = P(d, \mathcal{A}, u'),$$

$$P(d, \mathcal{A} \setminus (X \cap Y)) = P(d, \mathcal{A}, u \vee u'),$$

and

$$P(d, \mathcal{A} \setminus (X \cup Y)) = P(d, \mathcal{A}, u \wedge u').$$

Thus, the inequality in the theorem holds since $P(d, \mathcal{A}, u)$ is supermodular in $u_S$
and $X, Y \subseteq S$. ∎

Since the theorem holds for any realization of demand, it must also be true in 
expectation. 

Corollary 13.2.3 For any flexible arcs $\alpha, \beta$ in $\mathcal{A} \subseteq \mathcal{C}_n$, $E[P(D, \mathcal{A}, u)]$ is
supermodular in $u_\alpha$ and $u_\beta$ for any demand distribution $D$.

The corollary thus suggests that any two flexible arcs in the long chain comple- 
ment each other. That is, the existence of one flexible arc increases the marginal 
benefit that can be gained when the other flexible arc is added. 
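
The inequality of Theorem 13.2.2 can be checked numerically on small instances. The sketch below is only a sanity check under the assumptions of the theorem (unit plant capacities, designs contained in the long chain); it evaluates sales with the max-flow/min-cut identity, minimizing over product subsets $T$ the number of plants adjacent to $T$ plus the demand outside $T$. The names `max_sales` and `check_supermodular` are illustrative.

```python
from itertools import combinations

def max_sales(d, arcs):
    """P(d, A) for unit plant capacities via min-cut over product subsets (0-indexed)."""
    n = len(d)
    best = None
    for mask in range(1 << n):
        plants = {i for (i, j) in arcs if (mask >> j) & 1}
        value = len(plants) + sum(d[j] for j in range(n) if not (mask >> j) & 1)
        best = value if best is None else min(best, value)
    return best

def check_supermodular(d):
    """True if P(C\\(X&Y)) + P(C\\(X|Y)) >= P(C\\X) + P(C\\Y) for all flexible-arc subsets."""
    n = len(d)
    dedicated = {(i, i) for i in range(n)}
    flexible = [(i, (i + 1) % n) for i in range(n)]
    subsets = [frozenset(c) for r in range(n + 1) for c in combinations(flexible, r)]
    # P[X] is the sales of the long chain with the flexible arcs in X removed
    P = {X: max_sales(d, dedicated | (set(flexible) - X)) for X in subsets}
    return all(P[X & Y] + P[X | Y] >= P[X] + P[Y] for X in subsets for Y in subsets)

print(check_supermodular((0, 2, 1)))
print(check_supermodular((3, 0, 1, 0)))
```

Because the theorem holds for every demand realization, the check should return `True` for any nonnegative demand vector passed in.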

13.2.2 Incremental Benefits in Long Chains 

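Corollary 13.2.3 is useful to prove that the incremental benefits associated with
adding arcs to the long chain are increasing. Consider the following sequence of
flexibility designs: $\mathcal{L}^n_1, \mathcal{L}^n_2, \ldots, \mathcal{L}^n_n$, where we define $\mathcal{L}^n_1 = \mathcal{D}_n$ and
$\mathcal{L}^n_k = \mathcal{L}_k \cup \{(i, i) \mid i = k + 1, \ldots, n\}$. In words, $\mathcal{L}^n_k$ is simply the open chain
from plant 1 to product $k$ plus the dedicated arcs connecting plants $i$ to
products $i$ for all $k < i \le n$. One can think of the sequence $\mathcal{L}^n_1, \mathcal{L}^n_2, \ldots, \mathcal{L}^n_n, \mathcal{C}_n$ as
different stages of constructing $\mathcal{C}_n$, by starting at $\mathcal{L}^n_1 = \mathcal{D}_n$ and adding flexible
arcs $(1, 2), (2, 3), \ldots, (n - 1, n), (n, 1)$ sequentially. Finally, recall that $\mathcal{C}_n$ is
the long chain of size $n$. In the following, we show that the incremental benefit,
$[\mathcal{L}^n_{k+1}] - [\mathcal{L}^n_k]$, is nondecreasing with $k$.

Theorem 13.2.4 For any balanced system of size $n$ with exchangeable demand,
we have

$$[\mathcal{L}^n_2] - [\mathcal{L}^n_1] \le [\mathcal{L}^n_3] - [\mathcal{L}^n_2] \le \cdots \le [\mathcal{L}^n_n] - [\mathcal{L}^n_{n-1}] \le [\mathcal{C}_n] - [\mathcal{L}^n_n].$$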



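Proof. Fix any $1 < k \le n - 1$. Let $\alpha = (1, 2)$, $\beta = (k, k + 1)$. Note that by definition,

$$E[P(D, \mathcal{L}^n_{k+1})] = [\mathcal{L}^n_{k+1}]$$

and

$$E[P(D, \mathcal{L}^n_{k+1} \setminus \{\beta\})] = E[P(D, \mathcal{L}^n_k)] = [\mathcal{L}^n_k].$$

Let $D_\sigma = [D_2, D_3, \ldots, D_n, D_1]$. Observe that

$$E[P(D, \mathcal{L}^n_{k+1} \setminus \{\alpha, \beta\})] = E[P(D_\sigma, \mathcal{L}^n_{k-1})] = E[P(D, \mathcal{L}^n_{k-1})] = [\mathcal{L}^n_{k-1}]$$

and

$$E[P(D, \mathcal{L}^n_{k+1} \setminus \{\alpha\})] = E[P(D_\sigma, \mathcal{L}^n_k)] = E[P(D, \mathcal{L}^n_k)] = [\mathcal{L}^n_k],$$

where the above equalities hold since the random vector $D$ is exchangeable.

By Corollary 13.2.3, we have

$$E[P(D, \mathcal{L}^n_{k+1})] + E[P(D, \mathcal{L}^n_{k+1} \setminus \{\alpha, \beta\})] \ge E[P(D, \mathcal{L}^n_{k+1} \setminus \{\alpha\})] + E[P(D, \mathcal{L}^n_{k+1} \setminus \{\beta\})].$$

Thus, $[\mathcal{L}^n_{k+1}] - [\mathcal{L}^n_k] \ge [\mathcal{L}^n_k] - [\mathcal{L}^n_{k-1}]$ for $k = 2, \ldots, n - 1$.

To show $[\mathcal{L}^n_n] - [\mathcal{L}^n_{n-1}] \le [\mathcal{C}_n] - [\mathcal{L}^n_n]$, let $\alpha = (1, 2)$, $\beta = (n, 1)$ and let
$D_\sigma = [D_2, D_3, \ldots, D_n, D_1]$. Then

$$E[P(D, \mathcal{C}_n)] = [\mathcal{C}_n],$$

$$E[P(D, \mathcal{C}_n \setminus \{\beta\})] = [\mathcal{L}^n_n],$$

$$E[P(D, \mathcal{C}_n \setminus \{\alpha, \beta\})] = E[P(D_\sigma, \mathcal{L}^n_{n-1})] = [\mathcal{L}^n_{n-1}],$$

and

$$E[P(D, \mathcal{C}_n \setminus \{\alpha\})] = E[P(D_\sigma, \mathcal{L}^n_n)] = [\mathcal{L}^n_n].$$

Again by Corollary 13.2.3,

$$E[P(D, \mathcal{C}_n)] + E[P(D, \mathcal{C}_n \setminus \{\alpha, \beta\})] \ge E[P(D, \mathcal{C}_n \setminus \{\alpha\})] + E[P(D, \mathcal{C}_n \setminus \{\beta\})],$$

which implies that $[\mathcal{C}_n] - [\mathcal{L}^n_n] \ge [\mathcal{L}^n_n] - [\mathcal{L}^n_{n-1}]$. This completes the
proof. ∎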

Observe that the proof of Theorem 13.2.4 requires the application of the
supermodularity result (Theorem 13.2.2), which holds deterministically for any
fixed-demand instance. By contrast, Theorem 13.2.4 holds only in expectation under
exchangeable demand and need not hold for an individual fixed-demand instance.
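
A small instance makes this distinction concrete. The sketch below (illustrative, not from the text) takes $n = 3$ and the fixed demand $d = (0, 2, 1)$, for which the increments are not nondecreasing, and then averages over all permutations of $d$. A uniformly random permutation of a fixed vector is exchangeable, so the averaged increments should be nondecreasing. Sales are again evaluated with the min-cut formula over product subsets; the helper names are assumptions of the sketch.

```python
from fractions import Fraction
from itertools import permutations

def max_sales(d, arcs):
    """P(d, A) for unit plant capacities via min-cut over product subsets (0-indexed)."""
    n = len(d)
    best = None
    for mask in range(1 << n):
        plants = {i for (i, j) in arcs if (mask >> j) & 1}
        value = len(plants) + sum(d[j] for j in range(n) if not (mask >> j) & 1)
        best = value if best is None else min(best, value)
    return best

def increments(values):
    return [b - a for a, b in zip(values, values[1:])]

n = 3
dedicated = [(i, i) for i in range(n)]
flexible = [(0, 1), (1, 2), (2, 0)]                       # arcs (1,2), (2,3), (3,1)
designs = [dedicated + flexible[:k] for k in range(n + 1)]

# Fixed demand instance: the first flexible arc captures all the benefit.
fixed = increments([max_sales((0, 2, 1), A) for A in designs])

# Exchangeable demand: a uniformly random permutation of (0, 2, 1).
perms = list(permutations((0, 2, 1)))
averaged = increments([Fraction(sum(max_sales(d, A) for d in perms), len(perms))
                       for A in designs])

print("fixed instance:", fixed)
print("exchangeable  :", averaged)
```

For the fixed instance the increments are 1, 0, 0, whereas under the permutation-averaged demand they are 1/6, 1/3, 1/2.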



13.3 Characterizing the Performance of Long Chains 

In this section, we show that in a balanced system of size n with exchangeable 
demand, the performance of the long chain can be characterized by the differ- 
ence between the performances of two open chains, which allows us to show that 
the long chain is optimal among a class of flexibility designs and develop an effi- 
cient algorithm to compute its expected performance. Like the previous section, 
we start by developing several properties of the long chain when the demand is 
deterministic. 
13.3.1 Decomposition of a Long Chain

In this subsection, we fix an arbitrary demand instance $d$. Throughout the
subsection, when some integer $k$ appears in a statement, we are, in fact, referring to
some $i \in \{1, \ldots, n\}$ congruent to $k$ modulo $n$. For example, if plant $n + 3$ appears in
a statement, then we are referring to plant 3; and if $f_{n+1, n+2}$, the flow from plant
$n + 1$ to product $n + 2$, appears in a statement, then we are referring to $f_{1,2}$, the
flow from plant 1 to product 2.

First, we start with the following lemma. 

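Lemma 13.3.1 Suppose $P(d, \mathcal{C}_n \setminus \{\alpha\}) = P(d, \mathcal{C}_n)$, where $\alpha$ is a flexible arc in
$\mathcal{C}_n$. Then, for any set $X \subseteq S$, where $S$ is the set of all flexible arcs in $\mathcal{C}_n$, we have
that

$$P(d, \mathcal{C}_n \setminus (X \cup \{\alpha\})) = P(d, \mathcal{C}_n \setminus X).$$

Proof. If $\alpha \in X$, the result is trivial as $X \cup \{\alpha\} = X$. Otherwise, by Theorem 13.2.2,

$$P(d, \mathcal{C}_n \setminus (X \cup \{\alpha\})) + P(d, \mathcal{C}_n) \ge P(d, \mathcal{C}_n \setminus X) + P(d, \mathcal{C}_n \setminus \{\alpha\}),$$

which, together with the assumption that $P(d, \mathcal{C}_n) = P(d, \mathcal{C}_n \setminus \{\alpha\})$, implies that

$$P(d, \mathcal{C}_n \setminus (X \cup \{\alpha\})) \ge P(d, \mathcal{C}_n \setminus X).$$

Since by definition, $P(d, \mathcal{C}_n \setminus (X \cup \{\alpha\})) \le P(d, \mathcal{C}_n \setminus X)$, the lemma
holds. ∎

Next, we show that the sales associated with $\mathcal{C}_n$ can be expressed as a sum of
$n$ quantities, where each quantity is the difference of the sales associated with two
open chains in $\mathcal{C}_n$.

Theorem 13.3.2 For any fixed-demand instance $d$ on a balanced system of size $n$,
we have

$$P(d, \mathcal{C}_n) = \sum_{i=1}^{n} \Big( P(d, \mathcal{C}_n \setminus \{(i, i + 1)\}) - P(d, \mathcal{C}_n \setminus \{(i - 1, i), (i, i), (i, i + 1)\}) \Big).$$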

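Proof. Since the demand instance $d$ is fixed, for the sake of succinctness, we use
$P(\mathcal{A})$ to denote $P(d, \mathcal{A})$ in the proof. We also define $\alpha_i = (i, i + 1)$ and $\beta_i = (i, i)$
for $i = 1, 2, \ldots, n$ [note that $\alpha_n = (n, 1)$ as $n + 1$ is congruent to 1 modulo $n$].
We first claim that there is some $i^*$ such that $P(\mathcal{C}_n) = P(\mathcal{C}_n \setminus \{\alpha_{i^*}\})$. Indeed,
given an optimal solution $f^*$ of the maximum-flow problem defining $P(\mathcal{C}_n)$, the claim
is immediate if $f^*_{\alpha_i} = 0$ for some $i$. If instead $f^*_{\alpha_i} > 0$ for all $i = 1, \ldots, n$,
define a new flow $\bar{f}$ such that

$$\bar{f}_{\alpha_i} = f^*_{\alpha_i} - \delta, \qquad \bar{f}_{\beta_i} = f^*_{\beta_i} + \delta, \qquad \forall\, i = 1, \ldots, n,$$

where $\delta = \min_{i = 1:n} f^*_{\alpha_i}$. Let $i^*$ be such that $f^*_{\alpha_{i^*}} = \delta$. Clearly, $\bar{f}$ is feasible for the design
$\mathcal{C}_n \setminus \{\alpha_{i^*}\}$ and generates exactly the same amount of total flow as $f^*$ for the design $\mathcal{C}_n$.
Thus, $\bar{f}$ is optimal for the maximum-flow problem defining $P(\mathcal{C}_n \setminus \{\alpha_{i^*}\})$ and
$P(\mathcal{C}_n) = P(\mathcal{C}_n \setminus \{\alpha_{i^*}\})$. Without loss of generality, we assume that $i^* = n$, as we
can always relabel each plant (and product) $i$ by $i - i^*$.

For each $1 \le k_1 \le k_2 \le n$, define
$\mathcal{L}_{k_1 \to k_2} = \{(i, i) \mid i = k_1, k_1 + 1, \ldots, k_2\} \cup \{(i, i + 1) \mid i = k_1, k_1 + 1, \ldots, k_2 - 1\}$,
and for each $1 \le k_2 < k_1 \le n$, define
$\mathcal{L}_{k_1 \to k_2} = \{(i, i) \mid i = k_1, k_1 + 1, \ldots, n, 1, 2, \ldots, k_2\} \cup \{(i, i + 1) \mid i = k_1, \ldots, n, 1, \ldots, k_2 - 1\}$.
One can think of $\mathcal{L}_{k_1 \to k_2}$ as the open chain connecting plant $k_1$ to product $k_2$ in the
balanced system of size $n$.

By the definition of $\alpha$ and $\beta$, we have that for $i = 2, \ldots, n - 1$,

$$
\begin{aligned}
& P(\mathcal{C}_n \setminus \{\alpha_i\}) - P(\mathcal{C}_n \setminus \{\alpha_{i-1}, \alpha_i, \beta_i\}) \\
&\quad = P(\mathcal{C}_n \setminus \{\alpha_i\}) - P(\mathcal{C}_n \setminus \{\alpha_{i-1}, \alpha_i\}) + \min\{1, d_i\} \\
&\quad = P(\mathcal{C}_n \setminus \{\alpha_i, \alpha_n\}) - P(\mathcal{C}_n \setminus \{\alpha_{i-1}, \alpha_i, \alpha_n\}) + \min\{1, d_i\} \\
&\quad = P(\mathcal{L}_{1 \to i}) + P(\mathcal{L}_{(i+1) \to n}) + \min\{1, d_i\} \\
&\qquad - \big( P(\mathcal{L}_{1 \to (i-1)}) + P(\mathcal{L}_{(i+1) \to n}) + \min\{1, d_i\} \big) \\
&\quad = P(\mathcal{L}_{1 \to i}) - P(\mathcal{L}_{1 \to (i-1)}),
\end{aligned} \tag{13.2}
$$

where the first equality holds since $\mathcal{C}_n \setminus \{\alpha_{i-1}, \alpha_i\}$ is the union of two disjoint
components $\{\beta_i\}$ and $\mathcal{C}_n \setminus \{\alpha_{i-1}, \alpha_i, \beta_i\}$, the second equality follows from Lemma 13.3.1,
and the third equality holds since $\mathcal{C}_n \setminus \{\alpha_i, \alpha_n\}$ is the disjoint union of components
$\mathcal{L}_{1 \to i}$ and $\mathcal{L}_{(i+1) \to n}$, and $\mathcal{C}_n \setminus \{\alpha_{i-1}, \alpha_i, \alpha_n\}$ is the disjoint union of
$\mathcal{L}_{1 \to (i-1)}$, $\mathcal{L}_{(i+1) \to n}$, and $\{\beta_i\}$. Similarly,

$$
\begin{aligned}
& P(\mathcal{C}_n \setminus \{\alpha_1\}) - P(\mathcal{C}_n \setminus \{\alpha_n, \alpha_1, \beta_1\}) \\
&\quad = P(\mathcal{C}_n \setminus \{\alpha_1\}) - P(\mathcal{C}_n \setminus \{\alpha_n, \alpha_1\}) + \min\{1, d_1\} \\
&\quad = P(\mathcal{C}_n \setminus \{\alpha_1, \alpha_n\}) - P(\mathcal{C}_n \setminus \{\alpha_1, \alpha_n\}) + \min\{1, d_1\} \\
&\quad = \min\{1, d_1\},
\end{aligned} \tag{13.3}
$$

where the first equality holds since $\mathcal{C}_n \setminus \{\alpha_n, \alpha_1\}$ is the union of two disjoint
components $\{\beta_1\}$ and $\mathcal{C}_n \setminus \{\alpha_n, \alpha_1, \beta_1\}$ and the second equality follows from
Lemma 13.3.1.

Finally,

$$P(\mathcal{C}_n \setminus \{\alpha_n\}) - P(\mathcal{C}_n \setminus \{\alpha_{n-1}, \alpha_n, \beta_n\}) = P(\mathcal{L}_{1 \to n}) - P(\mathcal{L}_{1 \to (n-1)}). \tag{13.4}$$

Now, applying Equations (13.2)-(13.4), we obtain that

$$
\begin{aligned}
& \sum_{i=1}^{n} \big( P(\mathcal{C}_n \setminus \{\alpha_i\}) - P(\mathcal{C}_n \setminus \{\alpha_{i-1}, \alpha_i, \beta_i\}) \big) \\
&\quad = \min\{1, d_1\} + \sum_{i=2}^{n} \big( P(\mathcal{L}_{1 \to i}) - P(\mathcal{L}_{1 \to (i-1)}) \big) \\
&\quad = \min\{1, d_1\} + P(\mathcal{L}_{1 \to n}) - P(\mathcal{L}_{1 \to 1}) \\
&\quad = P(\mathcal{L}_{1 \to n}) \\
&\quad = P(\mathcal{C}_n \setminus \{\alpha_n\}) \\
&\quad = P(\mathcal{C}_n).
\end{aligned}
$$

This completes the proof. ∎

We note that $\mathcal{C}_n \setminus \{(i, i + 1)\}$ is an open chain connecting plant $i + 1$ to product
$i$, while $\mathcal{C}_n \setminus \{(i - 1, i), (i, i), (i, i + 1)\}$ is an open chain connecting plant $i + 1$ to
product $i - 1$.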
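
The identity in Theorem 13.3.2 is easy to test numerically for small $n$. The sketch below is a sanity check rather than part of the proof: it uses 0-indexed plants and products with indices taken modulo $n$, exact rational demands, and the min-cut formula over product subsets to evaluate sales. The helper names are assumptions of the sketch.

```python
import random
from fractions import Fraction

def max_sales(d, arcs):
    """P(d, A) for unit plant capacities via min-cut over product subsets (0-indexed)."""
    n = len(d)
    best = None
    for mask in range(1 << n):
        plants = {i for (i, j) in arcs if (mask >> j) & 1}
        value = len(plants) + sum(d[j] for j in range(n) if not (mask >> j) & 1)
        best = value if best is None else min(best, value)
    return best

def long_chain(n):
    return {(i, i) for i in range(n)} | {(i, (i + 1) % n) for i in range(n)}

def decomposition_rhs(d):
    """Right-hand side of Theorem 13.3.2: a sum of n differences of open-chain sales."""
    n = len(d)
    C = long_chain(n)
    total = 0
    for i in range(n):
        a_prev, b, a = ((i - 1) % n, i), (i, i), (i, (i + 1) % n)
        total += max_sales(d, C - {a}) - max_sales(d, C - {a_prev, b, a})
    return total

random.seed(0)
d = [Fraction(random.randint(0, 20), 10) for _ in range(4)]   # demands in {0, 0.1, ..., 2}
print(max_sales(d, long_chain(4)) == decomposition_rhs(d))
```

Using `Fraction` avoids rounding issues, so the two sides can be compared with exact equality.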

13.3.2 Characterization and Optimality 

With Theorem 13.3.2, we can now characterize the performance of the long chain
using the performances of open chains.

Theorem 13.3.3 For any balanced system of size $n$ with exchangeable demand
$D$, we have

$$[\mathcal{C}_n] = n\big([\mathcal{L}_n] - [\mathcal{L}_{n-1}]\big).$$

Proof. Theorem 13.3.2 states that for any $d$ that is an instance of $D$,

$$P(d, \mathcal{C}_n) = \sum_{i=1}^{n} \Big( P(d, \mathcal{C}_n \setminus \{(i, i + 1)\}) - P(d, \mathcal{C}_n \setminus \{(i - 1, i), (i, i), (i, i + 1)\}) \Big). \tag{13.5}$$

Since $D$ is exchangeable, for any $1 \le i \le n$,

$$E[P(D, \mathcal{C}_n \setminus \{(i, i + 1)\})] = [\mathcal{L}_n],$$

$$E[P(D, \mathcal{C}_n \setminus \{(i - 1, i), (i, i), (i, i + 1)\})] = [\mathcal{L}_{n-1}].$$

Thus, the theorem follows by taking expectations over $D$ on both sides of
Equation (13.5). ∎

Theorem 13.3.3 provides insights on the performance of long chains. Indeed,
it relates the expected performance of a long chain, $[\mathcal{C}_n]$, with the difference in
the expected performances of two open chains, $[\mathcal{L}_n]$ and $[\mathcal{L}_{n-1}]$, which are much
easier to compute and analyze.

An immediate corollary of Theorem 13.3.3 is that the long chain is optimal
among all 2-flexibility designs.

Corollary 13.3.4 Consider a balanced system of size n with exchangeable 
demand. Let F 2 be the set of all 2- flexibility designs of the system. That is, F 2 
is the set of all flexibility designs where each plant node and each product node are 
incident to exactly two arcs. Then we have 

Y&n] = arg max 
^gf 2 

In words, the long chain maximizes expected sales among all 2-flexibility designs 
in the system. 

Proof. Consider a 2-flexibility design $\mathcal{A} \in F_2$. $\mathcal{A}$ must consist of several closed
chains (i.e., induced subgraphs in $\mathcal{A}$ that form undirected cycles) denoted by
$SC_1, SC_2, \ldots, SC_k$. Let $n_i$ be the number of products and plants in the closed
chain $SC_i$. Since the system size is n, $\sum_{i=1}^{k} n_i = n$. Now, by Theorem 13.3.3, we
have

$$[\mathcal{A}] = \sum_{i=1}^{k} n_i \left([\mathcal{L}_{n_i}] - [\mathcal{L}_{n_i-1}]\right)$$

= - K”-i] +£?[min{l,I>i}]) 

< Ei=i«i(Kl - +£[min{l, £>!}]) 

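$$= \sum_{i=1}^{k} n_i \left([\mathcal{L}_n] - [\mathcal{L}_{n-1}]\right)$$

$$= n\left([\mathcal{L}_n] - [\mathcal{L}_{n-1}]\right)$$

$$= [\mathcal{C}_n],$$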

where the second and third equalities follow from the definition of , and the 
inequality from Theorem 13.2.4. I 

13.3.3 Computing the Performance of a Long Chain 

In this subsection, we present a method to compute $[\mathcal{C}_n]$, the performance of
the long chain, based on Theorem 13.3.2. We focus on balanced systems with iid
demand. Because we consider systems of arbitrary sizes, we let D be an infinite
random vector with iid entries, where $D_i$ is the random demand for product i
generated by a given distribution D for all $i \ge 1$.

Since $[\mathcal{C}_n]/n$ is the difference of $[\mathcal{L}_n]$ and $[\mathcal{L}_{n-1}]$ by Theorem 13.3.3, we first
introduce a greedy algorithm that finds the optimal solution of the linear program
associated with $P(d, \mathcal{L}_n)$, where d is an instance of D.

Finding the Optimal Solution $f^*$ for $P(d, \mathcal{L}_n)$

Step 1: Set $f^*_{11} = \min(1, d_1)$.

Step 2: For k = 2 to n,

set $f^*_{k-1,k} = \min\{1 - f^*_{k-1,k-1},\, d_k\}$ and $f^*_{kk} = \min\{1,\, d_k - f^*_{k-1,k}\}$.

Showing that $f^*$ is optimal for the open chain $\mathcal{L}_n$ is straightforward and is left
as an exercise.
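
The two steps above can be sketched in a few lines of Python. This is only an illustration under the section's assumptions (unit plant capacities, open chain with arcs (k, k) and (k-1, k)); the function names `greedy_open_chain` and `open_chain_sales` are hypothetical and do not come from the text.

```python
def greedy_open_chain(d):
    """Greedy flows for an open chain with unit plant capacities.

    d[k] is the demand of product k+1 (0-based list).
    Returns (dedicated, flexible), where dedicated[k] is the flow on arc
    (k+1, k+1) and flexible[k] is the flow on arc (k, k+1).
    flexible[0] has no corresponding arc and stays 0.
    """
    n = len(d)
    dedicated = [0.0] * n
    flexible = [0.0] * n
    if n == 0:
        return dedicated, flexible
    # Step 1: product 1 can only be served by plant 1.
    dedicated[0] = min(1.0, d[0])
    # Step 2: product k first uses the capacity left over at plant k-1,
    # then its own plant k.
    for k in range(1, n):
        flexible[k] = min(1.0 - dedicated[k - 1], d[k])
        dedicated[k] = min(1.0, d[k] - flexible[k])
    return dedicated, flexible


def open_chain_sales(d):
    """Total sales P(d, L_n) implied by the greedy flows."""
    dedicated, flexible = greedy_open_chain(d)
    return sum(dedicated) + sum(flexible)
```

For the demand instance d = (1.5, 0.5, 2.0), the sketch sends one unit on arc (1, 1), half a unit on arc (2, 2), half a unit on arc (2, 3), and one unit on arc (3, 3), so all three units of capacity are used.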

Given a random demand vector D, let $F_{ij}$ be the random flow on arc (i, j)
returned by the above algorithm, for $1 \le i, j \le n$. For each integer $1 \le k \le n-1$,
define $W_k = 1 - F_{kk}$ and $W_0 = 0$. $W_k$ can be thought of as the remaining capacity
in plant k after the production of product k at plant k is determined.

To develop a method to compute the performance of the long chain, assume that
the support of D lies in $\{i/N \mid i = 0, 1, 2, \ldots\}$ for some $N \ge 1$. Under this assumption,
we let $p_i = \Pr(D = i/N)$, for any $i = 0, 1, \ldots, 2N-1$, and $p_{2N} = \Pr(D \ge 2)$, where
Pr(·) denotes the probability mass function of D.

Since the support of D lies in $\{i/N \mid i = 0, 1, 2, \ldots\}$, it is easy to see that the
support of $F_{kk}$ lies in $\{i/N \mid i = 0, 1, 2, \ldots, N\}$. Since $W_k = 1 - F_{kk}$, the support set
of $W_k$ is also $\{i/N \mid i = 0, 1, 2, \ldots, N\}$. As a result, the distribution of $W_k$ can be
described by a row vector $q^k$ with N + 1 elements, where its ith component $q_i^k$
equals $\Pr(W_k = i/N)$, for $i = 0, 1, \ldots, N$. The following lemma illustrates how $q^k$
can be computed.



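Lemma 13.3.5 $q^{k+1} = q^k A = q^0 A^{k+1}$ for $0 \le k \le n-1$, where $q^0 = [1\ 0\ 0\ \ldots\ 0]$ and

$$A = \begin{bmatrix}
\sum_{i=N}^{2N} p_i & p_{N-1} & \cdots & p_1 & p_0 \\
\sum_{i=N+1}^{2N} p_i & p_N & \cdots & p_2 & p_0 + p_1 \\
\vdots & \vdots & & \vdots & \vdots \\
p_{2N-1} + p_{2N} & p_{2N-2} & \cdots & p_N & \sum_{i=0}^{N-1} p_i \\
p_{2N} & p_{2N-1} & \cdots & p_{N+1} & \sum_{i=0}^{N} p_i
\end{bmatrix}.$$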

Proof. Since $W_0$ is 0 by definition, $q^0 = [1\ 0\ 0\ \ldots\ 0]$. Because the demand is
independent and $W_k$ only depends on $D_1, \ldots, D_k$, $W_k$ is independent of $D_{k+1}$. Hence,
we have for $1 \le i \le N-1$,

$$q_i^{k+1} = \Pr\left(W_{k+1} = \tfrac{i}{N}\right) = \sum_{j=0}^{N} \Pr\left(W_k = \tfrac{j}{N}\right) \Pr\left(D_{k+1} = 1 - \tfrac{i}{N} + \tfrac{j}{N}\right) = \sum_{j=0}^{N} q_j^k\, p_{N+j-i}.$$

In addition,

$$q_0^{k+1} = \Pr(W_{k+1} = 0) = \sum_{j=0}^{N} \Pr\left(W_k = \tfrac{j}{N}\right) \Pr\left(D_{k+1} \ge 1 + \tfrac{j}{N}\right) = \sum_{j=0}^{N} q_j^k \sum_{l=N+j}^{2N} p_l$$

and

$$q_N^{k+1} = \Pr(W_{k+1} = 1) = \sum_{j=0}^{N} \Pr\left(W_k = \tfrac{j}{N}\right) \Pr\left(D_{k+1} \le \tfrac{j}{N}\right) = \sum_{j=0}^{N} q_j^k \sum_{l=0}^{j} p_l.$$

Thus, $q^{k+1} = q^k A$. ∎

A direct consequence of Lemma 13.3.5 is that the following matrix multiplications
can be used to determine the performance of the long chain when demands
are iid and the support of a product demand is a subset of $\{i/N \mid i = 0, 1, 2, \ldots\}$.

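Theorem 13.3.6 $\frac{[\mathcal{C}_n]}{n} = [\mathcal{L}_n] - [\mathcal{L}_{n-1}] = q^{n-1}\pi = q^0 A^{n-1}\pi$, where $\pi$ is a vector
of size N + 1, with

$$\pi_i = \sum_{j=1}^{N+i} \frac{j}{N}\, p_j + \frac{N+i}{N} \sum_{j=N+i+1}^{2N} p_j, \quad \forall\, 0 \le i \le N.$$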

Proof. The algorithm described above implies that $[\mathcal{L}_n] - [\mathcal{L}_{n-1}]$ can be written as
the expectation of $F_{n-1,n} + F_{nn}$, which is equal to $E[\min\{1 + W_{n-1}, D_n\}]$. Moreover,

$$E[\min\{1 + W_{n-1}, D_n\}] = \sum_{i=0}^{N} \Pr\left(W_{n-1} = \tfrac{i}{N}\right) E\left[\min\left\{D_n, 1 + \tfrac{i}{N}\right\}\right]$$

$$= \sum_{i=0}^{N} q_i^{n-1} \left( \sum_{j=1}^{N+i} \frac{j}{N}\, p_j + \frac{N+i}{N} \sum_{j=N+i+1}^{2N} p_j \right).$$

Hence, we have that $[\mathcal{L}_n] - [\mathcal{L}_{n-1}] = q^{n-1}\pi$. Apply Theorem 13.3.3, and we are
done. ∎
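
As a rough illustration of Lemma 13.3.5 and Theorem 13.3.6, the following pure-Python sketch builds the matrix A and the vector π from a probability vector p = (p_0, ..., p_{2N}) and evaluates q^0 A^{n-1} π by repeated vector-matrix products. The function names are hypothetical, and the code assumes that p has exactly 2N + 1 entries summing to 1, with the last entry holding Pr(D ≥ 2).

```python
def transition_matrix(p, N):
    """Matrix A of Lemma 13.3.5; row j is the current state W_k = j/N."""
    A = [[0.0] * (N + 1) for _ in range(N + 1)]
    for j in range(N + 1):
        A[j][0] = sum(p[N + j:])            # demand >= 1 + j/N, so W_{k+1} = 0
        for i in range(1, N):
            A[j][i] = p[N + j - i]          # demand = 1 + (j - i)/N
        A[j][N] = sum(p[:j + 1])            # demand <= j/N, so W_{k+1} = 1
    return A


def pi_vector(p, N):
    """pi_i = E[min{D, 1 + i/N}] for i = 0, ..., N."""
    return [
        sum(j * p[j] for j in range(1, N + i + 1)) / N
        + (N + i) / N * sum(p[N + i + 1:])
        for i in range(N + 1)
    ]


def long_chain_per_product(p, N, n):
    """Per-product performance [C_n]/n = q^0 A^(n-1) pi."""
    A = transition_matrix(p, N)
    pi = pi_vector(p, N)
    q = [1.0] + [0.0] * N                   # q^0: W_0 = 0 with probability 1
    for _ in range(n - 1):
        q = [sum(q[j] * A[j][i] for j in range(N + 1)) for i in range(N + 1)]
    return sum(qi * w for qi, w in zip(q, pi))
```

A quick sanity check: for N = 1 and p = (0.25, 0.5, 0.25), the sketch returns E[min{D, 1}] = 0.75 for n = 1, and for n = 2 it returns 0.8125, which agrees with E[min{D_1 + D_2, 2}]/2 because a long chain of size 2 coincides with full flexibility.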

The matrix multiplication method developed here to compute the performance
of the long chain is polynomial in N and n. Indeed, computing $q^0 A^{n-1}\pi$ requires
$O(nN^2)$ operations if one sequentially evaluates $q^0 A^i$ for $i = 1, \ldots, n-1$, or
$O(N^{2.807} \log n)$ operations if one starts by determining $A^{n-1}$ using the classical
algorithm from Strassen (1969).

The matrix multiplication method can be used to compute the per-product
performance of the long chain for an infinite-size system. Observe that the matrix
A is the transition matrix of a Markov chain with a state for each $i = 0, 1, \ldots, N$.
Since we focus on a balanced system with $E[D] = 1$, we can show that the states
in the Markov chain can be partitioned into two sets, a set containing states,
including state 0, that communicate with each other and the other set containing
inessential states. It is well known that $\lim_{n\to\infty} q^0 A^{n-1} = q^*$, where $q^* A = q^*$,
$q^*_0 > 0$, and $q^*_j = 0$ for any inessential state j. Thus, to compute $\lim_{n\to\infty} [\mathcal{C}_n]/n$,
one can solve for $q^*$ by finding the eigenvectors of A and then compute $q^*\pi$, which
gives $\lim_{n\to\infty} [\mathcal{C}_n]/n$.
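
The limit can also be approximated numerically without an eigenvector routine. The sketch below swaps in plain power iteration, starting from q^0 = [1 0 ... 0], for the eigenvector computation mentioned above; it assumes the iterates converge, and the function name, tolerance, and iteration cap are illustrative choices only.

```python
def limiting_per_product(A, pi, tol=1e-12, max_iter=100000):
    """Approximate lim q^0 A^(n-1) pi by power iteration.

    A is a row-stochastic matrix (list of rows) and pi a list of the same
    length. If the iterates do not settle within max_iter steps, the value
    for the last iterate is returned.
    """
    m = len(A)
    q = [1.0] + [0.0] * (m - 1)             # start in state 0, as q^0 does
    for _ in range(max_iter):
        nxt = [sum(q[j] * A[j][i] for j in range(m)) for i in range(m)]
        change = max(abs(a - b) for a, b in zip(nxt, q))
        q = nxt
        if change < tol:
            break
    return sum(qi * w for qi, w in zip(q, pi))
```

For example, with A = [[0.75, 0.25], [0.25, 0.75]] and π = (0.75, 1.0), which correspond to N = 1 and p = (0.25, 0.5, 0.25), the iterates approach q* = (0.5, 0.5) and the limiting per-product performance is 0.875.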

We can apply the matrix multiplication method for general iid demands as an
approximation algorithm to compute the performance of long chains. In this case,
one can approximate the performance of the long chain by discretizing the demand
distribution on the set $\{i/N \mid i = 0, 1, 2, \ldots\}$ for some integer N. Clearly, as N
increases, the error of the approximation decreases while the running time grows.
Specifically, it is straightforward to show that the error of the approximation is 
bounded by ^ . Interestingly, computational experience suggests that the error is 
much smaller than this bound. 

Moreover, the matrix multiplication method is fairly fast even for large N . For 
example, when N = 1000 and n = 100, $q^0 A^{n-1}\pi$ can be computed within 2 s using
Matlab on a standard 2.1GHz laptop. Hence, even for general iid demands, the 
matrix multiplication method can quickly approximate the performance of a large 
long chain very accurately. 

Figure 13.5 presents computational results obtained using the matrix multipli- 
cation method for three different iid demand distributions: 

• Normal: Demand for a product is a discretized normal random variable
with mean 1 and standard deviation of 0.33 on the support set of $\{i/14 \mid i =
0, 1, \ldots, 28\}$;

• Uniform: Demand for a product is uniformly distributed on the set $\{i/10 \mid i =
0, 1, 2, \ldots, 9, 11, 12, \ldots, 20\}$;

• Asymmetric: Demand for a product is equal to | with probability 0.4, 1 
with probability 0.5, and 2 with probability 0.1.

For each distribution, Fig. 13.5 depicts $[\mathcal{F}_n]/n$ (the per-product performance of
full flexibility), $[\mathcal{C}_n]/n$ (the per-product performance of the long chain), and $[\mathcal{C}_n]/[\mathcal{F}_n]$
(the ratio between the performance of the long chain and the performance of full
flexibility design) for $n = 1, \ldots, 30$.



[Figure 13.5: three panels (Normal, Asymmetric, Uniform), each with the series Full Flex, Long Chain, and Long Chain / Full Flex.]

FIGURE 13.5. The performance of long chains vs. the performance of full flexibility



Figure 13.5 reveals several interesting observations. First, $\frac{[\mathcal{F}_n]}{n} - \frac{[\mathcal{C}_n]}{n}$, that is,
the gap between the fill rates of full flexibility and the long chain, is increasing,
while the ratio $\frac{[\mathcal{C}_n]}{[\mathcal{F}_n]}$ is decreasing. In addition, Fig. 13.5 suggests that the quantity
$\frac{[\mathcal{C}_n]}{n}$, the fill rate of the long chain, is increasing but converges to a constant very
quickly.



13.4 Performance of Long Chains 

In this section, we present several analytical results that provide justifications to
the observations in Fig. 13.5. The following theorem illustrates that the quantity
$\frac{[\mathcal{F}_n]}{n} - \frac{[\mathcal{C}_n]}{n}$ is indeed nondecreasing with n. Its proof is a bit involved and hence
omitted.






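Theorem 13.4.1 For any integer $n \ge 2$ and iid demand,

$$\frac{[\mathcal{F}_n]}{n} - \frac{[\mathcal{C}_n]}{n} \le \frac{[\mathcal{F}_{n+1}]}{n+1} - \frac{[\mathcal{C}_{n+1}]}{n+1} \le \min\{1, E[D]\} - \gamma,$$

where $\gamma = \lim_{k\to\infty} \frac{[\mathcal{C}_k]}{k}$.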

Note that the fill rate of $\mathcal{C}_n$ (and $\mathcal{F}_n$) is equal to $\frac{[\mathcal{C}_n]}{nE[D]}$ (and $\frac{[\mathcal{F}_n]}{nE[D]}$). Thus,
Theorem 13.4.1 implies that the smaller the system size, the smaller the gap between
the fill rate of full flexibility and that of the long chain. This suggests that the
long chain is more effective relative to full flexibility for smaller systems.

Moreover, Theorem 13.4.1 can be used to bound the gap between the fill rate
of full flexibility and that of the long chain for systems of any size. Since for
many iid demand distributions with $E[D] = 1$, $\gamma$ is close to 1, as shown in Chou et al. (2010),
Theorem 13.4.1 implies that for any size system, the performance of the long chain
is close to that of full flexibility. For example, when D is normal with mean 1 and
standard deviation 0.33, $\gamma = 0.96$. Therefore, for this demand distribution, we
have that the gap between the fill rate of full flexibility and that of the long chain
for systems of any size is at most 4%.

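Though the ratio of the performance of long chain to that of full flexibility,
$\frac{[\mathcal{C}_n]}{[\mathcal{F}_n]}$, is observed to be nonincreasing empirically in Chou et al. (2008),
whether it can be proven analytically remains an open question. Of course, if
$\frac{[\mathcal{C}_n]}{[\mathcal{F}_n]} \ge \frac{[\mathcal{C}_{n+1}]}{[\mathcal{F}_{n+1}]}$ indeed holds, it follows that

$$\frac{[\mathcal{C}_n]}{[\mathcal{F}_n]} \ge \lim_{k\to\infty} \frac{[\mathcal{C}_k]/k}{[\mathcal{F}_k]/k} = \frac{\gamma}{\min\{E[D], 1\}}, \quad \forall n \ge 2, \tag{13.6}$$

where, as before, $\gamma = \lim_{k\to\infty} \frac{[\mathcal{C}_k]}{k}$ and $\lim_{k\to\infty} \frac{[\mathcal{F}_k]}{k} = \min\{E[D], 1\}$ by the weak
law of large numbers. This would provide a lower bound on the ratio of the
performance of the long chain to that of full flexibility for any system size.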

A slightly weaker lower bound of the ratio when $E[D] = 1$ is provided by
Simchi-Levi et al. (2012).

Corollary 13.4.2 Suppose demand is iid and $E[D] = 1$; then

$$\frac{[\mathcal{C}_n]}{[\mathcal{F}_n]} \ge 1 - \frac{(1 - \gamma)\, n}{[\mathcal{F}_n]},$$

where $\gamma = \lim_{k\to\infty} \frac{[\mathcal{C}_k]}{k}$.



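Proof. By Theorem 13.4.1,

$$\frac{[\mathcal{F}_i]}{i} - \frac{[\mathcal{C}_i]}{i} \le \frac{[\mathcal{F}_{i+1}]}{i+1} - \frac{[\mathcal{C}_{i+1}]}{i+1}, \quad \forall i \ge 1.$$

Thus,

$$\frac{[\mathcal{F}_i]}{i} - \frac{[\mathcal{F}_{i+1}]}{i+1} \le \frac{[\mathcal{C}_i]}{i} - \frac{[\mathcal{C}_{i+1}]}{i+1}, \quad \forall i \ge 1. \tag{13.7}$$

Now, adding inequality (13.7) for all $i \ge n$, we have that

$$\frac{[\mathcal{F}_n]}{n} - \lim_{k\to\infty} \frac{[\mathcal{F}_k]}{k} \le \frac{[\mathcal{C}_n]}{n} - \lim_{k\to\infty} \frac{[\mathcal{C}_k]}{k}. \tag{13.8}$$

But

$$\lim_{k\to\infty} \frac{[\mathcal{F}_k]}{k} = 1, \qquad \lim_{k\to\infty} \frac{[\mathcal{C}_k]}{k} = \gamma,$$

and substituting those into inequality (13.8), we have

$$\frac{[\mathcal{F}_n]}{n} - 1 \le \frac{[\mathcal{C}_n]}{n} - \gamma,$$

which leads to the inequality in the statement of this corollary. ∎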

To explain the power of the lower bound in Corollary 13.4.2, let $\delta_n = \frac{n}{[\mathcal{F}_n]} - 1$,
which implies that $1 - \frac{(1-\gamma)\, n}{[\mathcal{F}_n]} = \gamma - \delta_n (1 - \gamma)$. Since $\frac{[\mathcal{F}_n]}{n}$ is nondecreasing with n
(readers are asked to prove this claim in the exercises), $\delta_n$ is nonincreasing. Thus,
if $\delta_k \approx 0$ for some small integer k, then Corollary 13.4.2 provides a lower bound for
$\frac{[\mathcal{C}_n]}{[\mathcal{F}_n]}$ that is close to $\gamma$ for all $n \ge k$. Indeed, for many distributions with $E[D] = 1$,
$\delta_k \approx 0$ for small k. For example, suppose the distribution of D is normal with
mean 1 and standard deviation 0.33; then $\frac{3}{[\mathcal{F}_3]} = 1.08$, which implies $\delta_3 = 0.08$.
Since $\gamma = 0.96$, by applying Corollary 13.4.2, we have that

$$\frac{[\mathcal{C}_n]}{[\mathcal{F}_n]} \ge \gamma - \delta_3 (1 - \gamma) = 0.96 - 0.04 \times 0.08 = 0.9568, \quad \forall n \ge 3.$$

That is, when demand is normal with mean 1 and standard deviation 0.33, the
long chain of any size greater than 2 achieves at least 95.68% of the performance
of full flexibility.
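
The arithmetic in this example is easy to reproduce. The helper below is a hypothetical illustration of the bound in Corollary 13.4.2, written in terms of γ and the ratio n/[F_n]; the function name and argument names are assumptions for this sketch.

```python
def ratio_lower_bound(gamma, n_over_full_flex):
    """Lower bound gamma - delta_n * (1 - gamma) on [C_n]/[F_n].

    gamma is the limiting per-product performance of the long chain and
    n_over_full_flex is the ratio n / [F_n], so delta_n = n / [F_n] - 1.
    """
    delta = n_over_full_flex - 1.0
    return gamma - delta * (1.0 - gamma)
```

With γ = 0.96 and 3/[F_3] = 1.08, the helper returns approximately 0.9568, matching the 95.68% figure quoted above.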

Finally, we focus on the per-product performance (and fill rate, which is linearly
proportional to the per-product performance) of the long chain as a function of
system size. We start by showing that $\frac{[\mathcal{C}_n]}{n}$ is nondecreasing with n under iid
demand.

Theorem 13.4.3 Under iid demand D, we have $\frac{[\mathcal{C}_n]}{n} \le \frac{[\mathcal{C}_{n+1}]}{n+1}$, for any integer
$n \ge 2$.



Proof. Since D is iid, the first n (and n + 1) entries in D are exchangeable for a 
balanced system of size n (and n + 1). Thus, by Theorem 13.2.4, we have that 



m - [^-1] < Ki 1 ] - [^c +1 ], 



which is equivalent to

$$[\mathcal{L}_n] - [\mathcal{L}_{n-1}] - E[\min\{D, 1\}] \le [\mathcal{L}_{n+1}] - [\mathcal{L}_n] - E[\min\{D, 1\}]$$

and hence implies that

$$[\mathcal{L}_n] - [\mathcal{L}_{n-1}] \le [\mathcal{L}_{n+1}] - [\mathcal{L}_n].$$

Applying Theorem 13.3.3 completes the proof. ∎

The theorem thus states that $\frac{[\mathcal{C}_n]}{n}$, as well as the fill rate associated with a long
chain, increases with the number of products, n. This phenomenon is analogous 
to the classical “risk-pooling” effect associated with demand aggregation, except 
that here we aggregate across capacities. 

Interestingly, Fig. 13.5 suggests that the fill rate of the long chain quickly con- 
verges to a constant. This is shown in the next theorem, where we prove that the 
convergence rate is exponential for arbitrary iid, nondegenerate demands. 

Theorem 13.4.4 When demands are iid and nondegenerate, there exist constants
c > 0 and K > 0 such that

$$\frac{[\mathcal{C}_{n+1}]}{n+1} - \frac{[\mathcal{C}_n]}{n} \le K e^{-cn}$$

for any $n \ge 2$.



Proof. From the definition of $W_{n-1}$, we have that $\frac{[\mathcal{C}_n]}{n} = E[\min\{1 + W_{n-1}(D), D_n\}]$.
Let $D^2 = [D_2, D_3, \ldots]$. We have

$$\frac{[\mathcal{C}_{n+1}]}{n+1} - \frac{[\mathcal{C}_n]}{n} = E[\min\{1 + W_n(D), D_{n+1}\}] - E[\min\{1 + W_{n-1}(D^2), D_{n+1}\}]$$

$$= E[\min\{1 + W_n(D), D_{n+1}\} - \min\{1 + W_{n-1}(D^2), D_{n+1}\}]$$

$$\le \Pr\left(W_n(D) \ne W_{n-1}(D^2)\right),$$

where the last inequality is true because

$$\min\{1 + W_n(D), D_{n+1}\} - \min\{1 + W_{n-1}(D^2), D_{n+1}\}$$

never exceeds 1. Note that for any particular random instance d, $W_n(d) = W_{n-1}(d^2)$
if $W_i(d) = 0$ for some $1 \le i \le n$ or $W_i(d^2) = 1$ for some $1 \le i \le n-1$. Thus,

$$\Pr\left(W_n(D) \ne W_{n-1}(D^2)\right) \le \Pr\left(W_i(D) > 0,\ W_i(D^2) < 1,\ \forall\, 1 \le i \le n-1\right).$$

Therefore, we have

$$\frac{[\mathcal{C}_{n+1}]}{n+1} - \frac{[\mathcal{C}_n]}{n} \le \Pr\left(W_i(D) > 0,\ W_i(D^2) < 1,\ \forall\, 1 \le i \le n-1\right).$$

Now, since D is nondegenerate and iid, there exists some t such that

$$p = \Pr\Big(\sum_{j=1}^{t} (D_j - 1) \ge 1\Big) > 0.$$

If some instance d satisfies the condition

$$W_i(d) > 0,\ W_i(d^2) < 1,\ \forall\, 1 \le i \le n-1,$$

then we must have that

$$\sum_{j=2+(k-1)t}^{kt+1} (d_j - 1) < 1$$

for any 1 < k < Hence, 

_ m < p r ( Wi ( D ) > 0, Wi(D 2 ) < 1, VI < i < n) 

< Pr (EK - 1) < 1, VI < k < L^j) 

= ( 1-P)L^J 

< Ke~ cn 

for some constants K > 0 and c > 0. ∎

Figure 13.5 and Theorem 13.4.4 show that ~ f or any £ provided 

that n is large. Hence, it implies that in a system with a large number of plants 
and products, it is not necessary to have a long-chain design that connects all 
the plants and products. A collection of several chains, each of which has a large 
number of plants and products, can be as effective. 



13.5 Extensions 

In the previous section, we focused on long chains for balanced systems and were 
mainly concerned with the average performance. In addition, we assumed that 
production quantities are decided only after demand is realized. Various extensions 
can be found in Chou et al. (2010, 2011, 2012). In Chou et al. (2010), in addition to 
analyzing the asymptotic average performance of long chains, the authors analyze 
a system with general demand and supply. They show using random sampling that 
there exists a sparse flexibility structure that achieves a performance nearly as well 
as the full flexibility structure on average. Specifically, in a manufacturing system 
with n plants and m products, there exists a flexible design si with O ( ^ nJr ^ u ^j 
links such that E[P(D,s/)\ > (1 — e)E[P(D, where U = max and L = 

min Epr* 

Since the random sampling approach reveals very few insights regarding the 
flexibility structure, Chou et al. (2011) show that the so-called graph expander 
structure, a sparse but highly connected graph, often used in communication net- 
works, can achieve a performance nearly as good as the full flexibility structure 
in the worst case. For a balanced system, for any $\epsilon \in (0, 1)$, there exists a graph
expander $\mathcal{A}$ with no more than $\Delta$ arcs incident to each plant node such that
$P(d, \mathcal{A}) \ge (1 - \epsilon) P(d, \mathcal{F})$ for any instance d, as long as

A > 1 + log 2 U+{U + 1) log 2 e | v ! x 

-l°g 2 (l-e) 

The authors extend the concept of graph expander to derive a similar result for 
general systems. 

Finally, Chou et al. (2012) analyze a setting in which only a portion of the
production can be postponed until the realization of demand. They show that in this
case, though the performance of long chains deteriorates when the postponement
level is moderate, a sparse structure with a small amount of additional flexibility
can perform nearly as well as the full flexibility structure.



13.6 Exercises 



Exercise 13.1. (Murota and Shioura 2005) Consider the maximum-weight
circulation problem described in Sect. 13.2. Let S be an arc set consisting of arcs that
are in series (pairwise). Show that F(w, l, u) is $M^\natural$-convex in $w_S$, and $L^\natural$-concave
in $l_S$ and in $u_S$.

Exercise 13.2. (Murota and Shioura 2005) In a directed graph, two arcs $\alpha$, $\beta$
are said to be in parallel if any (undirected) cycle C containing both $\alpha$ and $\beta$
orients them in the opposite direction. Let P be an arc set consisting of arcs that
are in parallel (pairwise). Show that F(w, l, u) is $L^\natural$-convex in $w_P$, $M^\natural$-concave in
$l_P$ and in $u_P$.

Exercise 13.3. Show that $f^*$ computed in the algorithm in Sect. 13.3.3 is indeed
optimal for $P(d, \mathcal{L}_n)$.

Exercise 13.4. Show that $\frac{[\mathcal{F}_n]}{n}$ is nondecreasing in n. Note that the average sales
of the full flexibility design, $[\mathcal{F}_n]$, is given by $E[\min\{n, \sum_{i=1}^{n} D_i\}]$.

Supply Chain Planning Models



14.1 Introduction 

In the last decade, many companies have recognized that important cost savings 
and improved service levels can be achieved by effectively integrating produc- 
tion plans, inventory control, and transportation policies throughout their supply 
chains. The focus of this chapter is on planning models that integrate decisions 
across the supply chain for companies that rely on third-party carriers. These mod- 
els are motivated in part by the great development and growth of many competing 
transportation modes, mainly as a consequence of deregulation of the transporta- 
tion industry. This has led to a significant decrease in transportation costs charged 
by third-party distributors and, therefore, to an ever-growing number of companies 
that rely on third-party carriers for the transportation of their goods. 

One important mode of transportation used in the retail, grocery, and elec- 
tronic industries is the less-than-truckload (LTL) mode, which is attractive when 
shipment sizes are considerably less than truck capacity. Typically, LTL carriers 
offer volume, or quantity, discounts to their clients to encourage demand for larger, 
more profitable shipments. In this chapter, we model these discounts as a piecewise 
linear concave function of the quantity shipped. 

Similarly, production costs can often be approximated by piecewise linear and 
concave functions of the quantity produced, that is, setup plus linear manufactur- 
ing costs. These economies of scale motivate the shipper to coordinate the produc- 
tion, routing, and timing of shipments over the transportation network to minimize 
systemwide costs. In what follows, we refer to this problem as the shipper problem. 

D. Simchi-Levi et al., The Logic of Logistics: Theory, Algorithms, and Applications

This planning model, while quite general, is based on several assumptions that
are consistent with the view of modern logistics networks. Indeed, the model deals 
with situations in which all facilities are part of the same logistics network, and 
information is available to a central decision maker whose objective is to optimize 
the entire system. Thus, distribution problems in the retail and grocery indus- 
tries are special cases of our model where the logistics network does not include 
manufacturing facilities. 

The model also applies to situations in which suppliers and retailers are en- 
gaged in strategic partnering. For instance, in a vendor-managed inventory (VMI) 
partnership, point-of-sales data are transmitted to the supplier, which is responsi- 
ble for the coordination of production and distribution, including managing retail 
inventory and shipment schedules. Hence, in this case, the model includes manu- 
facturing facilities, warehouses, and retail outlets. 

This deterministic tactical model is motivated, in part, by our experience with 
a number of companies that apply similar models on a rolling-horizon basis. That
is, they consider forecast demand for the next 52 weeks and allow the model 
to generate a production, transportation, and inventory schedule for the entire 
planning horizon. The use of a rolling horizon implies that these companies employ 
the plan generated by the model only for a few time periods, say for the first three 
or four weeks. As time goes on, they update the demand forecasts and run the 
model again. 

While this model is deterministic, in practice, safety stocks are determined
exogenously and incorporated into the minimum inventory level that should be
maintained at the beginning of each period. Of course, an important question when
managing inventory in a complex supply chain is where to keep safety stock. The
answer to this question clearly depends on the desired level of service, the logis- 
tic network, the demand forecast and forecast error, as well as lead times and 
lead-time variability. Thus, in Sect. 14.3, we discuss models for positioning and 
optimizing safety stock in the supply chain. We start in the next section with our 
modeling approach and results for the shipper problem. 



14.2 The Shipper Problem 

In this section, we focus on the shipper problem under piecewise linear and con- 
cave production and transportation costs, and use properties resulting from the 
concavity of the cost function to devise an efficient algorithm. 

The objective of the shipper is to find a production plan, an inventory policy, 
and a routing strategy to minimize the total cost and satisfy all the demands. 
Backlogging of demands may be allowed, incurring a known penalty cost, which 
is a function of the length of the shortage period and the level of shortage. In 
this case, four different costs must be balanced to obtain an overall optimal pol- 
icy: production costs; LTL shipping charges; holding costs incurred when carrying 
inventory at some facility; and penalty costs for delayed deliveries.

To formulate this tactical problem, we first incorporate the time dimension into
the model by constructing the so-called expanded network. This expanded net- 
work is used to formulate the shipper problem as a set-partitioning problem. The 
formulation is found to have surprising properties, which are used to develop an 
efficient algorithm and to show that the linear programming relaxation of the 
set-partitioning formulation is tight in certain special cases (Sect. 14.2.4). Compu- 
tational results, demonstrating the performance of the algorithm on a set of test 
problems, are reported in Sect. 14.2.5. 



14.2.1 The Shipper Model

Consider a generic transportation network, G = (N, A), with a set of nodes N
representing the suppliers, warehouses, and customers. Customer demands for the 
next T periods are assumed to be deterministic, and each of them is considered a 
separate commodity, characterized by its origin, destination, size, and the time 
period when it is demanded. Our problem is to plan production and route ship- 
ments over time to satisfy these demands while minimizing the total production, 
shipping, inventory, and penalty costs. 

A standard technique to efficiently incorporate the time dimension into the
model is to construct the following expanded network. Let $\tau_1, \tau_2, \ldots, \tau_T$ be an
enumeration of the model's relevant time periods. In the original network, G, each
node i is replaced by a set of nodes $i_1, i_2, \ldots, i_T$. We connect node $i_u$ with node
$j_v$ if and only if $\tau_v - \tau_u$ is exactly the time it takes to travel from i to j. Thus, arc
$i_u \to j_v$ represents freight being carried from i to j starting at time $\tau_u$ and ending
at time $\tau_v$. We call such arcs shipping links. In order to account for penalties
associated with delayed shipments, a new node is created for each commodity
and serves as its ultimate sink. For a given commodity, a link between a node
representing its associated retailer at a specific time period and its corresponding
sink node represents the penalty cost of delivering a specific shipment in that time
period; it is called the penalty link. Similarly, to include production decisions in
the network model, we add for each node $i_t$, corresponding to a production facility
(supplier) i at a particular point in time t, a dummy node $i'_t$ and an arc from
$i'_t$ to $i_t$ whose cost represents the piecewise linear concave manufacturing costs.
Observe that these production links have the same cost structure as the shipping
links. Consequently, in our analysis of the network model, we will include them
in the set of shipping links. Finally, we add links $(i_l, i_{l+1})$ for $l = 1, 2, \ldots, T-1$,
referred to as inventory links.
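
A minimal sketch of this construction is given below. It covers only the shipping and inventory links (penalty and production links are omitted), and it assumes equally spaced periods, so that $\tau_v - \tau_u = v - u$ and travel times are whole numbers of periods. The function name and data layout are hypothetical.

```python
def build_expanded_network(nodes, arcs, travel_time, T):
    """Build shipping and inventory links of a time-expanded network.

    nodes: list of node names in the original network G
    arcs: list of (i, j) pairs in G
    travel_time: dict mapping (i, j) to a travel time in whole periods
    T: number of periods; expanded nodes are pairs (i, t), t = 1, ..., T
    """
    shipping = []
    for (i, j) in arcs:
        lag = travel_time[(i, j)]
        for u in range(1, T + 1):
            v = u + lag
            if v <= T:                       # arrival must fall inside the horizon
                shipping.append(((i, u), (j, v)))
    # Inventory links carry stock at the same node from period l to l + 1.
    inventory = [((i, l), (i, l + 1)) for i in nodes for l in range(1, T)]
    return shipping, inventory
```

For a supplier "s", a retailer "r", zero travel time, and T = 3, the sketch produces three shipping links, one per period, and two inventory links at each of the two nodes.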

Let $G_T = (V, E)$ be the expanded network. Figure 14.1 illustrates the
expanded network for a simple scenario where the shipping and inventory costs have
to be balanced over a time horizon of just three periods and shortages are not
allowed. For simplicity, we assume that travel times are zero.

Observe that, using the expanded network, we can formulate the shipper
problem as a concave-cost multicommodity network flow problem.

[Figure 14.1: a simple scenario and its associated expanded network over Periods 1, 2, and 3.]

FIGURE 14.1. Example of expanded network

14.2.2 A Set-Partitioning Approach

To describe our modeling approach, we introduce the following notation. Let $\mathcal{K} =
\{1, 2, \ldots, K\}$ be the index set of all commodities, or different demands with fixed
origin and destination, and let $w_k$, $k = 1, 2, \ldots, K$, be their corresponding size.
For instance, commodity k = 1 may correspond to a demand of $w_1 = 100$ units
that need to be shipped from a certain supplier to a certain retailer and must
arrive by a particular period of time or incur delay penalties. Let the set of all
possible paths for commodity k be $P_k$, and let $c_{pk}$ be the sum of inventory and
penalty costs incurred when commodity k is shipped along path $p \in P_k$. Observe
that the shipping cost associated with a path will depend on the total quantity
of all commodities being sent along each of its shipping links and, consequently,
it cannot be added to the path cost a priori. Thus, each shipping edge, whose cost
must be globally computed, needs to be considered separately. Let the set of all
shipping edges be SE, and for each edge $e \in SE$, let $z_e$ be the total sum of weight
of the commodities traveling on that edge.

We assume that the cost of a shipping edge e, $e \in SE$, of the expanded
network $G_T(V, E)$, is $F_e(z_e)$, a piecewise linear and concave cost function that
is nondecreasing in the total quantity, $z_e$, of the commodities sharing edge e. As
presented in Balakrishnan and Graves (1989), this special cost structure allows
for a formulation of the problem as a mixed integer linear program. For this
purpose, the piecewise linear concave functions are modeled as follows. Let R be the
number of different slopes in the cost function, which we assume, without loss of
generality, is the same for all edges to avoid cumbersome notation. Let $M_e^{r-1}$, $M_e^r$,
$r = 1, \ldots, R$, denote the lower and upper limits, respectively, on the interval of
quantities corresponding to the rth slope of the cost function associated with edge
e. Note that $M_e^0 = 0$ and $M_e^R$ can be set to the total quantity of all commodities
that may use arc e. We associate with each of these intervals, say r, a variable cost
per unit, denoted by $\alpha_e^r$, equal to the slope of the corresponding line segment, and
a fixed cost, $f_e^r$, defined as the y-intercept of the linear prolongation of that
segment. See Fig. 14.2 for a graphical representation. Observe that the cost incurred
by any quantity on a certain range is the sum of its associated fixed cost plus the
cost of sending all units at its corresponding linear cost. That is, we can express
the arc flow cost function, $F_e(z_e)$, as

$$F_e(z_e) = f_e^r + \alpha_e^r z_e$$

if $z_e \in (M_e^{r-1}, M_e^r]$.

[Figure 14.2: plot of $F_e(z_e)$ against $z_e$, with breakpoints $M_e^0, M_e^1, \ldots$ marked on the horizontal axis.]

FIGURE 14.2. Piecewise linear and concave cost structure

Clearly,

Property 14.2.1 The concavity and monotonicity of the function $F_e$ imply that

1. $\alpha_e^1 > \alpha_e^2 > \cdots > \alpha_e^R \ge 0$,

2. $0 \le f_e^1 < f_e^2 < \cdots < f_e^R$,

3. $F_e(z_e) = \min_{r=1,\ldots,R} \{ f_e^r + \alpha_e^r z_e \}$. The minimum is achieved at a unique
index s, unless $z_e = M_e^s$, in which case the two consecutive indexes s and
s + 1 lead to the same minimum cost.

We are now ready to introduce an integer linear programming formulation of
the shipper problem for this special cost structure. Recall that $z_e$ denotes the total
flow on edge e, and let $z_{ek}$ be the quantity of commodity k that is shipped along
that edge. For all $e \in SE$ and $r = 1, \ldots, R$, define the interval variables,

$$x_e^r = \begin{cases} 1, & \text{if } z_e \in (M_e^{r-1}, M_e^r], \\ 0, & \text{otherwise,} \end{cases}$$

and, in addition, for every k, $k \in \mathcal{K}$, let the quantity variables be

$$z_{ek}^r = \begin{cases} z_{ek}, & \text{if } z_e \in (M_e^{r-1}, M_e^r], \\ 0, & \text{otherwise.} \end{cases}$$

In order to relate these edge flows to path flows, we define, for each $e \in SE$ and
$p \in \bigcup_{k=1}^{K} P_k$,

$$\delta_p^e = \begin{cases} 1, & \text{if shipping link } e \text{ is in path } p, \\ 0, & \text{otherwise.} \end{cases}$$

Finally, let variables

$$y_{pk} = \begin{cases} 1, & \text{if commodity } k \text{ follows path } p \text{ in the optimal solution,} \\ 0, & \text{otherwise,} \end{cases}$$

for each $k \in \mathcal{K}$ and $p \in P_k$. These variables are referred to as path flow variables.
Observe that defining these variables as binary variables implies that for every
commodity k, only one of the variables $y_{pk}$ takes a positive value. This reflects a
common business practice in which each commodity, that is, items originated at
the same source and destined to the same sink in the expanded network, is shipped
along a single path. These integrality constraints are, however, not restrictive, as
pointed out in Property 14.2.2 below, since the problem is uncapacitated and the
cost functions concave.
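
Part 3 of Property 14.2.1, together with the interval index that the variables $x_e^r$ encode, can be illustrated with a short sketch. It assumes the fixed costs and slopes of one edge are supplied as two lists ordered by segment; the function name `edge_cost` and the sample numbers in the usage note are hypothetical.

```python
def edge_cost(z, fixed, slope):
    """Evaluate F_e(z) = min_r { fixed[r] + slope[r] * z } for one edge.

    fixed and slope list the intercepts f_e^r and slopes alpha_e^r in
    segment order. Returns (cost, r), where r is the 1-based index of a
    minimizing segment; at a breakpoint the lower index is reported.
    """
    if z < 0:
        raise ValueError("flow must be nonnegative")
    # min() keeps the first minimizer, so ties resolve to the lower segment.
    best = min(range(len(fixed)), key=lambda r: fixed[r] + slope[r] * z)
    return fixed[best] + slope[best] * z, best + 1
```

For instance, with intercepts (0, 10, 30) and slopes (3, 2, 1), the breakpoints are at 10 and 20; a flow of 15 costs 40 on segment 2, and a flow of exactly 10 costs 30 on both segments 1 and 2, with segment 1 reported.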

In the set-partitioning formulation of the shipper problem, the objective is to 
select a minimum-cost set of feasible paths. Thus, we formulate the shipper prob- 
lem for piecewise linear concave edge costs as the following mixed integer linear 
program, which we denote by Problem P. 

K R K 

Problem P : Min EE VpkCpk T EE[«+«!£*] 

k=l pePk eeSEr= 1 k=l 

s.t. 

E»pfe = l> Vfc = l,2 (14.1) 

pePk 

R 

E S pVpk w k = E z ek, Ve e SE, k = 1, . . . , K, (14.2) 

pePk r= 1 



C 

oT 

> 

VI 


(14.3) 


K 




Ve G SE, r = 1, . . . ,R, 


(14.4) 


k= 1 




K 




Y, z r ek > M r e ~ l x r e , Ve € SE, r = 1, . . . , R, 


(14.5) 


k=t 




R 




^<<1 Ve G SE, 

ry» ^ 


(14.6) 


y pk e {0, 1}, Vk = l,2,...,K, and p e P k , 


(14.7) 


x r e e {0, 1}, Ve e SE, and r = 1, 2, . . . , R, 


(14.8) 



Z r ek > o, Ve e SE, \/k — 1,2, . . . , K, 



and r = 1, 2, . . . , R. 
In this formulation, constraints (14.1) ensure that exactly one path is selected 
for each commodity, and constraints (14.2) set the total flow on an edge $e$ to be 
equal to the total flow of all the paths that use that edge. Constraints (14.3)-(14.6) 
are used to model the piecewise linear concave function. Constraints (14.3) specify 
that if some commodity $k$ is shipped on edge $e$ using cost index $r$, the associated 
interval variable, $x_e^r$, must be 1. Constraints (14.4) and (14.5) make sure that if 
cost index $r$ is used on edge $e$, then the total flow on that edge must fall in its 
associated interval, $[M_e^{r-1}, M_e^r]$. Finally, constraints (14.6) indicate that at most 
one cost range can be selected for each edge. 
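
To make the role of the cost index concrete, the following minimal Python sketch evaluates a piecewise linear concave edge cost and reports which cost index is active for a given flow. The breakpoints, fixed charges, and rates are made-up illustrative values, and the names `active_index` and `edge_cost` are ours, not part of the formulation. The sketch assumes that the cost on interval $r$ is $f^r + \alpha^r q$, that the cost of zero flow is zero, and that the pieces join continuously.

```python
from bisect import bisect_left

# Hypothetical data for a single edge (not taken from the text):
# breakpoints M^0 < M^1 < ... < M^R, fixed charges f^r, and rates alpha^r.
M = [0.0, 100.0, 300.0, 1000.0]
f = [25.0, 29.5, 35.8]        # chosen so that consecutive pieces meet at the breakpoints
alpha = [0.15, 0.105, 0.084]  # decreasing rates, hence a concave cost


def active_index(q):
    """Return the 1-based cost index r with M^{r-1} < q <= M^r."""
    if not 0.0 < q <= M[-1]:
        raise ValueError("flow must lie in (0, M^R]")
    return bisect_left(M, q)


def edge_cost(q):
    """Piecewise linear concave cost: zero at q = 0, else f^r + alpha^r * q."""
    if q == 0:
        return 0.0
    r = active_index(q)
    return f[r - 1] + alpha[r - 1] * q


if __name__ == "__main__":
    for q in (0, 50, 150, 600):
        print(q, edge_cost(q))
```

Because the rates decrease and the pieces meet at the breakpoints, the cost of any positive flow also equals the smallest of the values $f^r + \alpha^r q$ over all $r$, which is one way to see why at most one cost range per edge needs to be selected.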

Let $Z^*$ be the optimal solution value of Problem P. Let $Z_R^x$ and $Z_R^y$ be the optimal 
solution values of the relaxations of Problem P in which the integrality constraints 
on the interval ($x$) and path flow ($y$) variables, respectively, are dropped. A 
consequence of Property 14.2.1 is the following result. 

Property 14.2.2 We have 

$$Z^* = Z_R^x = Z_R^y.$$

To find a robust and efficient heuristic algorithm for Problem P, we study 
the performance of a relaxation of Problem P that drops integrality and re- 
dundant constraints. Although constraints (14.3) are not required for a correct 
mixed integer programming formulation of the problem, we keep them because 
they significantly improve the performance of the linear programming relaxation 
of Problem P. In fact, Croxton et al. (2003) show that, without them, the linear 
programming relaxation of this model approximates the piecewise linear cost 
functions by their lower convex envelope. Furthermore, keeping these constraints 
makes constraints (14.4)-(14.6) redundant in the correct mixed integer program- 
ming formulation, as a direct consequence of Property 14.2.1, part 3, and in the 
linear programming relaxation of Problem P as well, as Lemma 14.2.3 below shows. 
This will be useful to considerably reduce the size of the formulation of the problem 
while preserving the tightness of its linear programming relaxation. 

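Let Problem $P_{LP}^R$ be the linear program obtained from Problem P by relaxing 
the integrality constraints and dropping constraints (14.4)-(14.6). That is, 

$$
\text{Problem } P_{LP}^R: \quad \min \; \sum_{k=1}^{K} \sum_{p \in P_k} y_{pk} c_{pk}
  + \sum_{e \in SE} \sum_{r=1}^{R} \Big[ f_e^r x_e^r + \alpha_e^r \sum_{k=1}^{K} z_{ek}^r \Big]
$$

s.t. 

\begin{align}
& (14.1)-(14.3), \notag \\
& y_{pk} \ge 0, \quad \forall k = 1, 2, \ldots, K, \text{ and } p \in P_k, \notag \\
& x_e^r \ge 0, \quad \forall e \in SE, \text{ and } r = 1, 2, \ldots, R, \notag \\
& z_{ek}^r \ge 0, \quad \forall e \in SE, \; \forall k = 1, 2, \ldots, K, \text{ and } r = 1, 2, \ldots, R. \notag
\end{align}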

Chan, Muriel, and Simchi-Levi (1999) prove the following. 

Lemma 14.2.3 The optimal solution value to Problem $P_{LP}^R$ is equal to the optimal 
solution value to the linear programming relaxation of Problem P. 



14.2.3 Structural Properties 

To analyze the relaxed problem, we start by fixing the fractional path flows and 
study the behavior of the resulting linear program. Let $y = (y_{pk})$ be the vector of 
path flows in a feasible solution to the relaxed linear program, Problem $P_{LP}^R$. 

Observe that, given the vector of path flows $y$, the amount of each commodity 
sent on each edge is known and, thus, Problem $P_{LP}^R$ can be decomposed into 
multiple subproblems, one for every edge. Each subproblem determines the cost 
that the linear program associates with the corresponding edge flow. We refer to 
the subproblem associated with edge $e$ as the fixed-flow subproblem on edge $e$, or 
Problem $FF_e^y$. 

Let the proportion of commodity $k$ shipped along edge $e$ be 

$$\gamma_{ek} = \sum_{p \in P_k} \delta_p^e \, y_{pk}.$$

By Equation (14.2), the equality $\sum_{r=1}^{R} z_{ek}^r = w_k \gamma_{ek}$ must clearly hold; that 
is, the sum of all the flows of commodity $k$ on the different cost intervals on edge 
$e$ must be equal to the total quantity, $w_k \gamma_{ek}$, of commodity $k$ that is shipped on 
that edge. 

For each edge $e$, the total shipping cost on $e$, as well as the value of the 
corresponding variables $z_{ek}^r$ and $x_e^r$, which Problem $P_{LP}^R$ associates with the vector of 
path flows $y$, can be obtained by solving the fixed-flow subproblem on edge $e$: 

$$
\text{Problem } FF_e^y: \quad \min \; \sum_{r=1}^{R} \Big[ f_e^r x_e^r + \alpha_e^r \sum_{k=1}^{K} z_{ek}^r \Big]
$$

s.t. 

\begin{align}
z_{ek}^r &\le w_k x_e^r, & & \forall k = 1, \ldots, K, \text{ and } r = 1, \ldots, R, \tag{14.9} \\
\sum_{r=1}^{R} z_{ek}^r &= w_k \gamma_{ek}, & & \forall k = 1, \ldots, K, \tag{14.10} \\
z_{ek}^r &\ge 0, & & \forall k = 1, \ldots, K, \text{ and } r = 1, \ldots, R, \notag \\
x_e^r &\ge 0, & & \forall r = 1, \ldots, R. \notag
\end{align}

Let $C_e^*(y) \equiv C_e^*(\gamma_{e1}, \ldots, \gamma_{eK})$ be the optimal solution to the fixed-flow 
subproblem on edge $e$ for a given vector of path flows $y$ or, equivalently, for given 
corresponding proportions $\gamma_{e1}, \ldots, \gamma_{eK}$ of the commodities shipped on that edge. 
The following theorem determines the solution to the subproblem. 

Theorem 14.2.4 For any given edge $e \in SE$, let the proportion $\gamma_{ek}$ of commodity 
$k$ to be shipped on edge $e$ be known and fixed, for $k = 1, 2, \ldots, K$, and let the 
commodities be indexed in nondecreasing order of their corresponding proportions, 
that is, 

$$\gamma_{e1} \le \gamma_{e2} \le \cdots \le \gamma_{eK}.$$

Then the optimal solution to the fixed-flow subproblem on edge $e$ is 

$$
C_e^*(\gamma_{e1}, \ldots, \gamma_{eK}) = \sum_{k=1}^{K} F_e\Big(\sum_{i=k}^{K} w_i\Big) \big[\gamma_{ek} - \gamma_{e,k-1}\big], \tag{14.11}
$$

where $\gamma_{e0} := 0$. 

Intuitively, the above theorem just says that in an optimal solution to the fixed-flow 
subproblem associated with any edge $e$, fractions of commodities are consolidated 
to be shipped at the cheapest possible cost per unit. At first, a fraction 
$\gamma_{e1}$ of all commodities $1, 2, \ldots, K$ is available. Thus, these commodities get 
consolidated to achieve a cost per unit of $F_e(\sum_{k=1}^{K} w_k) / \sum_{k=1}^{K} w_k$, that is, the cost 
per unit associated with sending the full $K$ commodities on that edge, and the 
available fraction $\gamma_{e1}$ is sent incurring a cost of $\gamma_{e1} F_e(\sum_{k=1}^{K} w_k)$. At that point, 
none of commodity 1 is left and a fraction $(\gamma_{e2} - \gamma_{e1})$ is the maximum available 
simultaneously from all commodities $2, 3, \ldots, K$. These commodities are consolidated 
again, and that fraction, $(\gamma_{e2} - \gamma_{e1})$, from each commodity is sent at a cost 
$(\gamma_{e2} - \gamma_{e1}) F_e(\sum_{k=2}^{K} w_k)$. This process continues until the desired proportion of 
each commodity has been sent. 
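
The consolidation argument can be sketched in a few lines of Python. This is only an illustration of expression (14.11): the function name `fixed_flow_cost` and the fixed-charge cost used in the example are our own choices, and the edge cost `F` is assumed to be supplied as a concave function with `F(0) = 0`.

```python
def fixed_flow_cost(F, w, gamma):
    """Evaluate expression (14.11) for one edge.

    F     -- edge cost as a function of total quantity (assumed concave, F(0) = 0)
    w     -- list of commodity sizes w_k
    gamma -- list of fractions gamma_ek of each commodity sent on the edge
    """
    # Index the commodities in nondecreasing order of their fractions.
    order = sorted(range(len(w)), key=lambda k: gamma[k])
    remaining = sum(w)  # total size of the commodities still being consolidated
    previous = 0.0      # gamma_{e,k-1}, with gamma_{e0} = 0
    cost = 0.0
    for k in order:
        # The increment (gamma_k - previous) of every remaining commodity
        # travels together at the cost of shipping all of them in full.
        cost += F(remaining) * (gamma[k] - previous)
        previous = gamma[k]
        remaining -= w[k]
    return cost


if __name__ == "__main__":
    # Hypothetical edge with a pure fixed charge of 100.
    fixed_charge = lambda q: 100.0 if q > 0 else 0.0
    print(fixed_flow_cost(fixed_charge, [100, 100, 100], [0.0, 0.5, 0.5]))  # 50.0
    print(fixed_flow_cost(fixed_charge, [100, 100, 100], [1.0, 1.0, 1.0]))  # 100.0
```

With a pure fixed charge, sending half of every active commodity is charged exactly half of the fixed cost, which is the behavior that makes the linear programming relaxation weak on fixed-charge edges.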

14.2.4 Solution Procedure 

Theorem 14.2.4 provides a simple expression of the cost that the relaxed problem, 
Problem $P_{LP}^R$, assigns to any given fractional path flows, and thus it allows for 
the efficient computation of the impact of modifying the flow in a particular path. 
This is the key to the algorithm developed in this section. Indeed, the algorithm 
transforms an optimal fractional solution to the linear program $P_{LP}^R$ into an integer 
solution by modifying path flows, choosing for each commodity the path that leads 
to the lowest increase in the objective of the linear program. 

The Linear Programming-Based Heuristic: 

Step 1: Solve the linear program, Problem $P_{LP}^R$. Initialize $k = 1$. 

Step 2: For each arc, compute a marginal cost, which is the increase in cost 
incurred in the fixed-flow subproblem by augmenting the fractional flow of 
commodity $k$ to 1. Note that this is easy to compute using Theorem 14.2.4. 

Step 3: Determine a path for commodity $k$ by finding the minimum-cost path on 
the expanded network with edge costs equal to the marginal costs. 

Step 4: Update the flows and the costs on each link (again employing 
Theorem 14.2.4) to account for commodity $k$ being sent along that path. 

Step 5: Let $k = k + 1$, and repeat Steps 2-5 until $k = K + 1$. 
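
Steps 2-5 can be sketched as follows, taking the fractional solution of Step 1 as given. This is a minimal illustration under stated assumptions, not the authors' implementation: the data layout (`edges`, `commodities`, `gamma`) and all function names are hypothetical, Step 4 is interpreted as setting the fraction of commodity $k$ to 1 on the chosen path and to 0 elsewhere, every sink is assumed reachable, and marginal costs are assumed nonnegative so that Dijkstra's algorithm applies.

```python
import heapq


def fixed_flow_cost(F, w, gamma):
    """Expression (14.11): cost of fractions gamma on an edge with cost function F."""
    cost, previous, remaining = 0.0, 0.0, sum(w)
    for k in sorted(range(len(w)), key=lambda i: gamma[i]):
        cost += F(remaining) * (gamma[k] - previous)
        previous, remaining = gamma[k], remaining - w[k]
    return cost


def shortest_path(cost, source, sink):
    """Dijkstra on a dict {(u, v): cost}; returns the list of edges on a cheapest path."""
    adjacency = {}
    for (u, v), c in cost.items():
        adjacency.setdefault(u, []).append((v, max(c, 0.0)))  # guard against round-off
    dist, pred, heap = {source: 0.0}, {}, [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, c in adjacency.get(u, []):
            if d + c < dist.get(v, float("inf")):
                dist[v], pred[v] = d + c, u
                heapq.heappush(heap, (d + c, v))
    path, v = [], sink
    while v != source:
        path.append((pred[v], v))
        v = pred[v]
    return path[::-1]


def lp_based_rounding(edges, commodities, gamma):
    """Steps 2-5 of the heuristic.

    edges       -- {(u, v): F}, F being the edge cost function
    commodities -- list of (source, sink, size)
    gamma       -- {(u, v): [fraction of commodity k on the edge]} from the LP;
                   modified in place as commodities are fixed
    """
    w = [size for _, _, size in commodities]
    paths = []
    for k, (source, sink, _) in enumerate(commodities):
        # Step 2: increase in fixed-flow cost if commodity k is sent in full.
        marginal = {}
        for e, F in edges.items():
            full = gamma[e][:k] + [1.0] + gamma[e][k + 1:]
            marginal[e] = fixed_flow_cost(F, w, full) - fixed_flow_cost(F, w, gamma[e])
        # Step 3: cheapest path with respect to the marginal costs.
        path = shortest_path(marginal, source, sink)
        # Step 4: commodity k now travels entirely on that path.
        chosen = set(path)
        for e in edges:
            gamma[e][k] = 1.0 if e in chosen else 0.0
        paths.append(path)
    return paths
```

Recomputing every marginal cost from scratch keeps the sketch short; an implementation aimed at large instances would presumably update the fixed-flow costs incrementally.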

Evidently, the effectiveness of this heuristic depends on the tightness of the linear 
programming relaxation of Problem P. For this reason, we study the difference 
between integer and fractional solutions to Problem P. Chan et al. (1999) show 
that in some special cases an integer solution can be constructed from the optimal 
fractional solution of Problem $P_{LP}^R$ without increasing its cost. In particular, using 
Theorem 14.2.4, they prove the following result. 

Theorem 14.2.5 In the following cases: 

1. single period, multiple suppliers, multiple retailers, two warehouses, 

2. two periods, single supplier, multiple retailers, single warehouse, 

3. two periods, multiple suppliers, multiple retailers, single warehouse using a 
cross-docking strategy, 

4. multiple periods, single supplier, single retailer, single warehouse that uses a 
cross-docking strategy, 

the solution to the linear programming relaxation of Problem P is the optimal 
solution to the shipper problem. That is, 

$$Z^* = Z^{LP}.$$

Furthermore, in the first three cases, all extreme point solutions to the linear 
program are integer. 

The cross-docking strategy referred to in the last two cases is a strategy in 
which the stores are supplied by central warehouses that do not keep any stock 
themselves. That is, in this strategy, the warehouses act as coordinators of the 
supply process and as transshipment points for incoming orders from outside ven- 
dors. 

The theorem thus demonstrates the exceptional performance of the linear 
programming relaxation, and consequently of the heuristic, in some special cases. A 
natural question at this point is whether these results can be generalized. The 
answer is no in general. To show this, Chan et al. (1999) construct examples with 
a single supplier, a single warehouse, and multiple retailers and time periods, for 
which 

$$\frac{Z^*}{Z^{LP}} \to \infty$$

as the number of retailers and time periods increases. 

Lemma 14.2.6 The linear programming relaxation of Problem P can be arbitrarily 
weak, even for a single-supplier, single-warehouse, multiretailer case in which 
demand for the retailers is constant over time. 

It is important to point out that the instances in which the heuristic solution is 
found to be arbitrarily bad are characterized by an unrealistic structure of the 
shipping cost. In these instances, the shipping cost between two facilities is a pure 
fixed charge (regardless of quantity shipped) in some periods, linear (with no fixed 
charges) in others, and prohibitively expensive, so that nothing can be shipped, 
in the remaining periods. The following examples illustrate this structure. 

Example of weak linear programming solution: Consider a three-period, 
single-warehouse model in which a single supplier delivers goods to a warehouse 
that, in turn, replenishes inventory of three retailers over time. The warehouse 
uses a cross-docking strategy; thus, it does not keep any inventory. Let the 
transportation cost be a fixed charge of 100 for any shipment from the supplier to 
the warehouse at any period. The transportation cost from the warehouse to retailer $i$, 
$i = 1, 2, 3$, is very large for shipments made in period $i$ (in other words, retailer 
$i$ cannot be reached in period $i$) and negligible for periods $j \ne i$. Let the 
inventory cost be negligible for all retailers at all periods, and let the demand for each 
retailer be 0 units in periods 1 and 2 and 100 units in period 3. 

Observe that in order to reach the three retailers, shipments need to be made in 
at least two different periods. Thus, the optimal integer solution is 200. However, 
in the solution to the linear program, 50 units are sent to each of the “reachable” 
retailers in each period, and a transportation cost of 50 is charged at each period 
(as stated in Theorem 14.2.4, since only a fraction of 1/2 of the commodities is 
sent on any edge, exactly that fraction of the fixed cost is charged). Thus, the 
optimal fractional solution is 150 and the ratio of integer to fractional solutions is 
200/150 = 4/3. 
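
The arithmetic of this example can be checked with a short Python sketch. It enumerates the integer shipping plans by brute force and takes the fractional solution exactly as described in the example (half of each fixed charge in each of the three periods) rather than solving a linear program; all variable names are illustrative.

```python
from itertools import product

FIXED_CHARGE = 100        # supplier-to-warehouse cost per period used
RETAILERS = (1, 2, 3)
PERIODS = (1, 2, 3)

# Integer plans: each retailer receives its 100 units in a single period,
# and retailer i cannot be reached in period i.
best_integer = min(
    FIXED_CHARGE * len(set(plan))            # one fixed charge per period used
    for plan in product(PERIODS, repeat=len(RETAILERS))
    if all(period != i for i, period in zip(RETAILERS, plan))
)

# Fractional solution described in the example: in every period, half of each
# reachable commodity is shipped, so half of the fixed charge is incurred.
fractional = sum(FIXED_CHARGE / 2 for _ in PERIODS)

ratio = best_integer / fractional

if __name__ == "__main__":
    print(best_integer, fractional, ratio)
```

No single period can serve all three retailers, so every feasible integer plan pays at least two fixed charges, which is what the enumeration confirms.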

In this instance, even if fractional and integer solutions are different, the lin- 
ear programming-based heuristic generates the optimal integer solution. However, 
we can easily extend the above scenario to instances for which the difference be- 
tween the solution generated by the heuristic and the optimal integer solution is 
arbitrarily large. 

Example of weak heuristic solution: For that purpose, we add $n$ new periods 
to the above setting. In period 4, the first of the new periods, the cost for shipping 
from supplier to warehouse is linear at a rate of 1/3, and the cost for shipping from 
the warehouse to each of the three retailers is 0. In all the other $n - 1$ periods, the 
cost of shipping is very high, and thus no shipments will be made after period 4. 
Inventory costs at all retailers and all periods are negligible. Demand for each of 
the three retailers at each of the new $n$ periods is 100, while demand during the 
first three periods is 0. It is easy to see that the optimal integer and fractional 
solutions are identical to those in the three-period case, with costs of 200 and 
150, respectively. However, the heuristic algorithm will always choose to ship each 
commodity in period 4, since the increase in cost in the corresponding path would 
be $\tfrac{1}{3} \times 100$, while it is at least 50 in any of the first three periods. Thus, the 
total cost of the heuristic solution is $\tfrac{1}{3} \times 100 \times n$ and the gap with the optimal 
integer solution is arbitrarily large. 
The following section reports the performance of the algorithm on a set of 
randomly generated instances. 

14.2.5 Computational Results 

The computational tests carried out are divided into three categories: 

1. single-period layered networks, 

2. general networks, 

3. multiperiod, single-warehouse distribution problems: 

• pure distribution instances. 

• production/distribution instances. 

The first two categories are of special interest because they allow us to compare 
our results with those reported by Balakrishnan and Graves (1989), henceforth 
B&G. The third set of problems models practical situations in which each of the 
retailers is assigned to a single warehouse and production and transportation costs 
have to be balanced with inventory costs over time. 

In the three categories, the tests were run on a Sun SPARC20 and CPLEX was 
used to solve the linear program, Problem $P_{LP}^R$, using an equivalent formulation 
where path flow variables are replaced by flow-balance constraints. During our 
computational work, we observed that the dual simplex method is more efficient 
than the primal simplex method in solving these highly degenerate problems, an 
observation also made by Melkote (1996). This is usually the case for programs 
with variable upper-bound constraints, such as our constraints $z_{ek}^r \le w_k x_e^r$. We 
should also point out that most of the CPU time reported in our tests is used in 
solving the linear program. Thus, to enhance the computational performance of our 
algorithm and increase the size of the problems that it is capable of handling, we see 
the need for future research to focus on efficiently solving the linear program. For 
instance, the original set-partitioning formulation, Problem $P_{LP}^R$, could be solved 
faster using column-generation techniques. In these tests, however, we focused on 
evaluating the quality of the integer solutions provided by the heuristic and the 
tightness of the linear programming relaxation. 

We now discuss each class of problems and the effectiveness of our algorithm. 

Single-Period Layered Networks 

B&G present exceptional computational results for single-period layered networks. 
In these instances, commodities flow from the manufacturing facilities to distribu- 
tion centers, where they are consolidated with other shipments. These shipments 
are then sent to a number of warehouses, where they are split and shipped to their 
final destinations. Thus, every commodity must go through two layers of interme- 
diate points: consolidation points, also referred to as distribution centers, and 
breakbulk points, or warehouses. 
TABLE 14.1. Test problems generated as in Balakrishnan and Graves 

                                     Problem class
Number of                LTL1     LTL2      LTL3      LTL4      LTL5
-------------------------------------------------------------------------
Nodes: Source            4        5         6         8         10
Nodes: Consolidation     5        10        12        15        20
Nodes: Breakbulk         5        10        12        15        20
Nodes: Destination       4        5         6         8         10
Arcs                     42-47    131-141   190-207   309-312   358-372
Commodities              10       20        30        50        60


TABLE 14.2. Computational results for layered networks. Balakrishnan and Graves’ 
results (B&G) vs. those of our linear programming-based heuristic (LPBH) 

                 B&G           LPBH
Problem class    LB/UB         LP/heuristic    Avg. CPU time
                 percentage    percentage      in seconds
--------------------------------------------------------------
LTL1             99.8          100             1.04
LTL2             100           100             7.94
LTL3             99.6          100             20.74
LTL4             99.1          100             55.72
LTL5             99.5          100             100.48



To test the performance of our algorithm and to compare it with that of B&G, 
we generated instances of the layered networks following the details given in their 
paper. In this computational work, we considered five different problem classes, 
referred to as LTL1-LTL5. 

Table 14.1 shows the sizes of the different classes of problems. For each of these 
classes, the first column (B&G) of Table 14.2 presents the average ratio between 
the upper bounds generated by the heuristic proposed by B&G and a lower bound 
on the optimal solution, over five randomly generated instances. The numbers 
are taken from their paper. We do not include, though, their average CPU times 
because the machines they use are completely different than ours and, in addition, 
they do not report the total computational time for the entire algorithm. The 
second and third columns report the average deviation from optimality and the 
computational performance of the linear programming-based heuristic (LPBH) 
over 10 random instances, for each of the problem classes. In all of them, our 
algorithm finds the optimal integer solution; furthermore, the solution to the 
linear program in the first step of our algorithm is integer, providing the optimal 
solution to the problem. 

Of course, since in all previous instances, the linear program provided the opti- 
mal integer solution, the performance of our procedure has not really been tested. 
In the following subsections, we present computational results for problem classes 
in which the solution to the linear program is not always integer. 
TABLE 14.3. Computational results for general networks. Balakrishnan and Graves’ 
results (B&G) vs. those of our linear programming-based heuristic (LPBH) 

           Size                           B&G           LPBH
Problem    No. of   No. of    No. of     LB/UB         LP/heuristic   Avg. CPU time
class      nodes    arcs      comm.      percentage    percentage     in seconds
--------------------------------------------------------------------------------------
GEN1       10       47-54     10         99.9          100            2.18
GEN2       15       109-136   20         98.7          99.53          24.04
GEN3       20       196-235   30         98.4          99.88          139.83
GEN4       30       364-428   50         96.2          98.59          1313.06
GEN5       40       340-370   60         98.5          99.98          159.57



General Networks 

In this subsection, we report on the performance of our algorithm on general 
networks, in which every node can be an origin and/or a destination, generated 
exactly as they are generated by B&G. These results, together with those of B&G, 
are reported in Table 14.3. In this category, B&G consider five different problem 
classes, referred to as GEN1, ..., GEN5, and generate five random instances for 
each of them. We, in turn, solve 10 different randomly generated instances for each 
of the problem classes. Again, we do not include their average CPU times due to 
the reasons mentioned above. 

Multiperiod, Single-Warehouse Distribution Problems 

Here we consider a single-warehouse model where a set of suppliers replenishes 
the inventory of a number of retailers over time. We test two different types of in- 
stances: pure distribution instances in which the routing and timing of shipments 
are to be determined, and production/distribution instances in which the produc- 
tion schedule is also integrated with the transportation and inventory decisions. 

A. Pure Distribution Instances 

We assume that shortages are not allowed and analyze three different strategies: 

1. Classical inventory/distribution strategy: Material always flows from the 
suppliers through a single warehouse, where it can be held as inventory. 

2. Cross-docking strategy: All material flows through the warehouse, where 
shipments are reallocated and immediately sent to the retailers. 

3. A distribution strategy that allows for direct shipments: Items may be sent 
either through the warehouse or directly to the retailer. The warehouse may 
keep inventory. 

TABLE 14.4. Linear and setup costs used for all the test problems 

Type of arc       $\alpha_e^1$    $\alpha_e^2$    $\alpha_e^3$    Setup
--------------------------------------------------------------------------
Supplier-whs.     0.15            0.105           0.084           25
Whs.-retailer     0.25            0.20            0.16            10


TABLE 14.5. Inventory costs and different ranges for the different test problems 

Problem    Inventory cost          Supplier-whs. cost      Whs.-retailer cost
class      Warehouse   Retailer    Range 1    Range 2      Range 1    Range 2
--------------------------------------------------------------------------------
I1         5           10          800        1500         200        400
I2         5           10          800        1500         300        600
I3         5           10          800        1500         300        600
I4         10          20          1000       2000         150        300
I5         10          20          1000       2000         200        400
I6         10          20          1000       2000         200        400
C1         10          20          800        1500         200        400
C2         10          20          800        1500         300        600
C3         10          20          800        1500         300        600
C4         10          20          1000       2000         150        300
C5         10          20          1000       2000         200        400
C6         10          20          1000       2000         200        400
D1         10          20          500        1000         150        300
D2         10          20          500        1000         200        400
D3         10          20          500        1000         200        400


For each strategy, we analyze different situations where the number of suppliers 
is either 1, 2, or 5, the number of retailers is 10, 12, or 20, and the number of periods 
is 8 or 12. For each combination of the number of suppliers, retailers, and periods 
presented in Table 14.6, 10 instances are generated. The retailers and suppliers 
are randomly located on a 1000 x 1000 grid, while the warehouse is randomly 
assigned to the 400 x 400 subgrid at the center. Demand is generated for each 
retailer-supplier pair at each time period, except for the cases with five suppliers, 
in which each of these pairs has an associated demand with probability 1/3. These 
demands are generated from a uniform distribution on the integers in the interval 
[0, 100). 
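
The instance generation just described can be sketched in Python as follows. Several details are our assumptions rather than statements from the text: locations are drawn as continuous coordinates, the central subgrid is taken to be [300, 700] x [300, 700], and in the five-supplier case the probability 1/3 is applied once per retailer-supplier pair rather than once per period. The function name and return layout are illustrative.

```python
import random


def generate_instance(n_suppliers, n_retailers, n_periods, seed=0):
    """Draw one random test instance (a sketch under the assumptions above)."""
    rng = random.Random(seed)

    def point(low, high):
        return (rng.uniform(low, high), rng.uniform(low, high))

    suppliers = [point(0, 1000) for _ in range(n_suppliers)]
    retailers = [point(0, 1000) for _ in range(n_retailers)]
    warehouse = point(300, 700)  # central 400 x 400 subgrid of the 1000 x 1000 grid

    # Assumed: with five suppliers, a pair has demand with probability 1/3.
    pair_probability = 1 / 3 if n_suppliers == 5 else 1.0

    demand = {}
    for s in range(n_suppliers):
        for r in range(n_retailers):
            if rng.random() < pair_probability:
                for t in range(n_periods):
                    # Uniform on the integers 0, 1, ..., 99.
                    demand[s, r, t] = rng.randrange(0, 100)
    return suppliers, retailers, warehouse, demand


if __name__ == "__main__":
    suppliers, retailers, warehouse, demand = generate_instance(2, 10, 12)
    print(len(demand), warehouse)
```

Passing a different `seed` for each of the 10 instances of a problem class gives reproducible but distinct instances.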

All suppliers and retailers are linked to the warehouse, and the distance associ- 
ated is the corresponding Euclidean distance between the nodes of the grid. In the 
case of a distribution strategy that allows for direct shipments, shipping edges from 
each of the suppliers to each of the retailers are added. The holding costs per unit 
of inventory are different at the warehouses and retailer facilities and are presented 
in Table 14.5. All holding costs at the suppliers are set to zero. Two shipping cost 
functions, representing the cost per item per unit distance, are considered: The 
first is assigned to shipments from the suppliers to the warehouse. The second 
is incurred by the material flowing from the warehouse to the retailers. The cost 
function (dollars per mile per unit) associated with direct shipments is equal to 
that of shipments from the warehouse to a retailer. Both functions have an initial 
setup cost for using the link and three different linear rates depending on the 
quantity shipped; see Table 14.4. However, the ranges to which those linear costs 
correspond are different for the different problem classes. This is done so that, in 
an optimal solution, shipments are consolidated, and thus the concave cost 
function plays an important role in the analysis. These ranges and the corresponding 
problem classes are presented in Table 14.5. 


TABLE 14.6. Computational results for a single warehouse 

                          Problem   Number of   Number of   Number of   LP/heuristic   CPU time
Strategy                  class     suppliers   stores      periods     percentage     in seconds
---------------------------------------------------------------------------------------------------
Classical inventory/      I1        1           10          12          100            65.21
distribution strategy     I2        2           10          12          100            187.37
                          I3        5           10          12          100            163.23
                          I4        1           20          8           99.946         83.5
                          I5        2           20          8           100            210.51
                          I6        5           20          8           99.953         200.68
Cross-docking strategy    C1        1           10          12          100            60.0
                          C2        2           10          12          100            174.13
                          C3        5           10          12          100            159.06
                          C4        1           20          8           100            79.73
                          C5        2           20          8           100            202.83
                          C6        5           20          8           100            186.0
Direct shipments          D1        1           12          8           100            51.23
allowed                   D2        2           12          8           100            165.83
                          D3        5           12          8           99.921         117.27

Observe (see Table 14.6) that in most of the instances tested, the linear program 
is tight and it provides the optimal integer solution. In only three of the 150 
instances generated is the solution to the linear program not integer; in such cases, 
our algorithm finds a solution that is within 0.8% from the optimal fractional 
solution. 

B. Production/Distribution Instances 

This section demonstrates the effectiveness of the algorithm when applied to pro- 
duction/distribution systems, that is, systems in which one needs to coordinate 
production planning, inventory control, and transportation strategies over time. 
For that purpose, we consider the same set of problems, I1-I3, as in the classical 
inventory/distribution strategy described in the previous section, and add 
production decisions at each of the supplier sites. This is incorporated into the model, as 
explained in Sect. 14.2. 

We consider a fixed setup cost for producing at any period plus a certain cost 
per unit. The setup cost is varied in the set {50,100,500,1000}, and the linear 
production cost is set to 1. The inventory holding rate at the supplier site (after 
production) is set to half of that at the warehouse. For the 60 different instances 
generated, the linear programming relaxation gave an integer solution every time. 
14.3 Safety Stock Optimization 

As observed earlier, the shipper model analyzed above is deterministic; safety 
stocks are determined exogenously and incorporated into the minimum inventory 
level that should be maintained at the beginning of each period. The objective of 
this section is to present a model for positioning and optimizing safety stock in 
the supply chain. 

For this purpose, consider a single-product, single-facility, periodic-review in- 
ventory model. Let SI be the amount of time it takes from placing an order until 
the facility receives a shipment; this time is referred to as the incoming service 
time. Let S be the committed service time made by the facility to its own 
customers, and let T be the processing time at the facility. Of course, we must 
assume that SI + T > S; otherwise, no inventory is needed in the facility. 

We assume that the facility manages its inventory following a periodic review 
policy and that demand is independent and identically distributed across time 
periods following a normal distribution. Given deterministic SI, S, and T, and 
with no setup costs, the level of safety stock that the facility needs to keep (see 
Exercise 14.1) is equal to 



$$z h \sqrt{SI + T - S},$$

where $z$ is the safety stock factor associated with a specified level of service and 
$h$ is the inventory holding cost. The value $SI + T - S$ is referred to as the facility 
net lead time. 

To understand our model, consider the following two-stage supply chain with 
facility 2 feeding facility 1, which serves the end customer (Fig. 14.3). Define $SI_i$, 
$S_i$, and $T_i$ as before for $i = 1, 2$. Thus, $S_1$ is the committed service time to the 
end customer, $S_2$ is the commitment that facility 2 makes to facility 1, and hence 
$S_2 = SI_1$. Finally, $SI_2$ is the supplier commitment to facility 2. 

Our objective is to minimize the total supply chain cost without affecting, or 
pushing, inventory to the supplier. Observe that if we reduce $S_2 = SI_1$, we will 
affect inventory at both facility 1 and facility 2. Indeed, by reducing the committed 
service time that facility 2 makes to facility 1, inventory at facility 1 is reduced, 
but inventory at the second facility is increased. Thus, our objective is to develop a 
model that selects the appropriate level of commitment that one facility makes to 
its downstream facility so as to minimize the total, or more precisely, systemwide 
safety stock cost. 
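
The trade-off in the two-stage chain can be illustrated with a small Python sketch. It uses the safety stock expression $z h \sqrt{SI + T - S}$ for each facility, assumes that service times are integer numbers of periods, and simply enumerates the candidate values of $S_2$; the function names and the numbers in the example are hypothetical.

```python
from math import sqrt


def two_stage_cost(S2, SI2, T1, T2, S1, z, h1, h2):
    """Total safety stock cost of the two facilities for a given commitment S2."""
    x1 = S2 + T1 - S1   # net lead time at facility 1, since SI_1 = S_2
    x2 = SI2 + T2 - S2  # net lead time at facility 2
    if x1 < 0 or x2 < 0:
        raise ValueError("net lead times must be nonnegative")
    return z * h1 * sqrt(x1) + z * h2 * sqrt(x2)


def best_commitment(SI2, T1, T2, S1, z, h1, h2):
    """Enumerate integer commitments S2 and return the cheapest one."""
    lowest = max(0, S1 - T1)  # keeps facility 1's net lead time nonnegative
    highest = SI2 + T2        # keeps facility 2's net lead time nonnegative
    return min(
        range(lowest, highest + 1),
        key=lambda S2: two_stage_cost(S2, SI2, T1, T2, S1, z, h1, h2),
    )


if __name__ == "__main__":
    # Hypothetical data: immediate supply and delivery, processing times 4 and 6.
    for h1 in (2.0, 3.0):
        S2 = best_commitment(SI2=0, T1=4, T2=6, S1=0, z=1.645, h1=h1, h2=1.0)
        print(h1, S2, round(two_stage_cost(S2, 0, 4, 6, 0, 1.645, h1, 1.0), 3))
```

In this made-up example, raising the downstream holding cost from 2 to 3 moves the cheapest commitment from one end of the feasible range to the other, which shows how strongly the choice of $S_2$ shifts inventory between the two facilities.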




FIGURE 14.3. Illustration of the model 
For this purpose, consider a supply network $G(N, A)$, which is acyclic, with $N$ 
facilities and where $A$ is the set of edges. Let $D \subseteq N$ be the set of customers, or 
demand points, with $s_j$ being an upper bound on the commitment to be made to 
customer $j$, $j \in D$. 

Following our discussion, we formulate the problem of setting commitments and 
safety stock levels as the following nonlinear optimization problem. 

$$
\text{Problem } SS: \quad \min \; \sum_{j=1}^{N} \phi_j(X_j)
$$

s.t. 

\begin{align}
X_j &= SI_j + T_j - S_j, & & \forall j = 1, \ldots, N, \tag{14.12} \\
X_j &\ge 0, & & \forall j = 1, \ldots, N, \tag{14.13} \\
SI_j - S_i &\ge 0, & & \forall (i, j) \in A, \tag{14.14} \\
S_j &\le s_j, & & \forall j \in D, \tag{14.15} \\
S_j, SI_j &\ge 0, & & \forall j = 1, \ldots, N, \tag{14.16}
\end{align}

where $\phi_j(X_j) = z h_j \sqrt{X_j}$. 

Observe that in this formulation, there are two sets of decision variables. The 
first is $S_j$, the commitment made by facility $j$ to all its downstream facilities. The 
second is $SI_j$, the implied incoming service time to facility $j$. This incoming service 
time is the maximum of the committed service times of all the upstream facilities 
feeding facility $j$. 

Thus, constraint (14.12) defines the net lead time at facility $j$. Constraint (14.14) 
forces the incoming service time for facility $j$ to be no smaller than the commitment 
that each facility $i$ with $(i, j) \in A$ makes to facility $j$. Finally, constraint (14.15) 
forces the commitment to the end customer to be no larger than the target. 

Of course, the challenge is to solve this formulation effectively. Graves and 
Willems (2000) propose a dynamic programming algorithm, while Magnanti et al. 
(2003) develop an efficient algorithm based on an approach similar to the one 
described earlier for the shipper problem. 

14.4 Exercises 

Exercise 14.1. Consider a single-product, single-facility, infinite-horizon, 
periodic-review model. Assume that the inventory is managed based on a stationary base 
stock policy and unsatisfied demand is backlogged. At each period, demand arrives 
according to a normal distribution $N(\mu, \sigma)$. (Let’s assume that the probability of 
having negative demand is negligible.) Let $SI$ be the incoming service time, $S$ be 
the committed service time, and $T$ be the processing time at the facility. Finally, 
assume that the initial inventory level equals the base stock level and assume the 
service level is $\alpha$, which is defined as the probability that the demand can be 
fully satisfied from current on-hand inventories. Show that the base stock level is 
$(SI + T - S)\mu + \Phi^{-1}(\alpha) \sqrt{SI + T - S}\, \sigma$, where $\Phi^{-1}$ is the inverse of the cumulative 
distribution function of the standard normal distribution. 

Exercise 14.2. Assume, in Problem SS of Sect. 14.3, that the supply network 
reduces to a serial supply chain. Show that $S_i = 0$ or $S_i = S_{i+1} + T_i$, where 
stage $i + 1$ serves stage $i$. Based on this observation, propose an algorithm to solve 
Problem SS. 



15 

Facility Location Models 



15.1 Introduction 

One of the most important aspects of logistics is deciding where to locate new 
facilities such as retailers, warehouses, or factories. These strategic decisions are a 
crucial determinant of whether materials will flow efficiently through the distribu- 
tion system. 

In this chapter, we consider several important warehouse location problems: the 
p-median problem; the single-source capacitated facility location problem; and a 
distribution system design problem. In each case, the problem is to locate a set 
of warehouses in a distribution network. We assume that the cost of locating a 
warehouse at a particular site includes a fixed cost (e.g., building costs, rental 
costs, etc.) and a variable cost for transportation. This variable cost includes the 
cost of transporting the product to the retailers as well as possibly the cost of 
moving the product from the plants to the warehouse. In general, the objective is 
to locate a set of facilities so that the total cost is minimized subject to a variety 
of constraints, which might include 

• Each warehouse has a capacity, which limits the area it can supply. 

• Each retailer receives shipments from one and only one warehouse. 

• Each retailer must be within a fixed distance of the warehouse that supplies
it, so that a reasonable delivery lead time is ensured.

Location analysis has played a central role in the development of the operations
research field. In this area lie some of the discipline’s most elegant results and 
theories. We note here the paper of Cornuejols et al. (1977) and the two excel- 
lent books devoted to the subject by Mirchandani and Francis (1990) and Daskin 
(1995). Location problems encompass a wide range of problems such as the loca- 
tion of emergency services (firehouses or ambulances), the location of hazardous 
materials, and problems in telecommunications network design, just to name a 
few. 

In the next section, we present an exact algorithm for one of the simplest location
problems, the p-median problem. We then generalize this model and algorithm to
incorporate additional factors important to the design of the distribution network, 
such as warehouse capacities and fixed costs. In Sect. 15.4, we present a more 
general model where all levels of the distribution system (plants and retailers) 
are taken into account when deciding warehouse locations. We also present an 
efficient algorithm for its solution. All of the algorithms developed in this chapter 
are based on the Lagrangian relaxation technique described in Sect. 6.3, which 
has been applied successfully to a wide range of location problems. Finally, in 
Sect. 15.5, we describe the structure of the optimal solution to problems in the 
design of large-scale logistics systems. 

15.2 An Algorithm for the p-Median Problem

Consider a set of retailers geographically dispersed in a region. The problem is to 
choose where in the region to locate a set of p identical warehouses. We assume 
there are m > p sites that have been preselected as possible locations for these 
warehouses. Once the p warehouses have been located, each of n retailers will get 
its shipments from the warehouse closest to it. We assume the following: 

• There is no fixed cost for locating at a particular site. 

• There is no capacity constraint on the demand supplied by a warehouse. 

Note that the first assumption also encompasses the case where the fixed cost is 
not site-dependent, and therefore the fixed setup cost for locating p warehouses is 
independent of where they are located. 

Let the set of retailers be $N$, where $N = \{1, 2, \ldots, n\}$, and let the set of potential
sites for warehouses be $M$, where $M = \{1, 2, \ldots, m\}$. Let $w_i$ be the demand or
flow between retailer $i$ and its warehouse for each $i \in N$. We assume that the cost
of transporting the $w_i$ units of product from warehouse $j$ to retailer $i$ is $c_{ij}$, for
each $i \in N$ and $j \in M$.

The problem is to choose p of the m sites where a warehouse will be located in
such a way that the total transportation cost is minimized. This is the p-median
problem.

The continuous version of this problem, where any point is a potential warehouse 
location, was first treated as early as 1909 by Weber. The discrete version was 
analyzed by Kuehn and Hamburger (1963) as well as Hakimi (1964), Manne (1964), 
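Balinski (1965), and many others.

We present here a highly effective approach to the problem. Define the following
decision variables:

$$
Y_j = \begin{cases} 1, & \text{if a warehouse is located at site } j, \\ 0, & \text{otherwise,} \end{cases}
$$

for $j \in M$, and

$$
X_{ij} = \begin{cases} 1, & \text{if retailer } i \text{ is served by a warehouse at site } j, \\ 0, & \text{otherwise,} \end{cases}
$$

for $i \in N$ and $j \in M$. The p-median problem is then

$$
\begin{aligned}
\text{Problem } P: \quad \text{Min} \quad & \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} X_{ij} \\
\text{s.t.} \quad & \sum_{j=1}^{m} X_{ij} = 1, && \forall i \in N, && (15.1) \\
& \sum_{j=1}^{m} Y_j = p, && && (15.2) \\
& X_{ij} \le Y_j, && \forall i \in N,\ j \in M, && (15.3) \\
& X_{ij}, Y_j \in \{0, 1\}, && \forall i \in N,\ j \in M. && (15.4)
\end{aligned}
$$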

Constraints (15.1) guarantee that each retailer is assigned to a warehouse. Con- 
straint (15.2) ensures that p sites are chosen. Constraints (15.3) guarantee that a 
retailer selects a site only from among those that are chosen. Constraints (15.4) 
force the variables to be integer. 
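
To make the formulation concrete, the following minimal Python sketch solves a tiny instance of Problem P by enumerating every choice of p sites. It is only an illustration of the objective and constraints, not the algorithm developed in this section; the function name `p_median_brute_force` and the cost matrix `c` are hypothetical, and enumeration is practical only for very small m.

```python
from itertools import combinations


def p_median_brute_force(c, p):
    """Enumerate all p-subsets of sites.

    c[i][j] is the cost of serving retailer i from site j (c_ij in the text).
    Returns (best cost, tuple of chosen sites).
    """
    n, m = len(c), len(c[0])
    best_cost, best_sites = float("inf"), None
    for sites in combinations(range(m), p):
        # With no capacities, each retailer uses its cheapest open site,
        # which satisfies (15.1) and (15.3) automatically.
        cost = sum(min(c[i][j] for j in sites) for i in range(n))
        if cost < best_cost:
            best_cost, best_sites = cost, sites
    return best_cost, best_sites


# Hypothetical instance: 4 retailers, 3 candidate sites, p = 2.
c = [[0, 2, 9],
     [2, 0, 7],
     [9, 7, 0],
     [8, 6, 1]]
cost, sites = p_median_brute_force(c, 2)
print(cost, sites)
```

On this instance two site pairs tie at the minimum cost; the sketch keeps the first one it encounters.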

This formulation can easily handle several side constraints. If a handling fee is
charged for each unit of product going through a warehouse, these costs can be
added to the transportation cost along all arcs leaving the warehouse. Also, if a
particular limit is placed on the length of any arc between retailer $i$ and warehouse
$j$, this can be incorporated by simply setting the per-unit shipping cost ($c_{ij}$) of
every arc that exceeds the limit to $+\infty$. In addition, the model can be easily
extended to cases where a set of facilities is already in place and the choice is
whether to open new facilities or expand the existing facilities.

Let $Z^*$ be the optimal solution value of Problem P. One simple and effective technique
to solve this problem is the method of Lagrangian relaxation described in Sect. 6.3.

As described in Sect. 6.3, Lagrangian relaxation involves relaxing a set of con- 
straints and introducing them into the objective function with a multiplier vector. 
This provides a lower bound on the optimal solution to the overall problem. Then, 
using a subgradient search method, we iteratively update our multiplier vector in 
an attempt to increase the lower bound. At each step of the subgradient procedure 
(i.e., for each set of multipliers), we also attempt to construct a feasible solution 
to the location problem. This step usually consists of a simple and efficient sub- 
routine. After a prespecified number of iterations, or when the solution found is 
within a fixed error tolerance of the lower bound, the algorithm is terminated. 



To solve the p-median problem, we choose to relax constraints (15.1). We incorporate
these constraints in the objective function with the multiplier vector
$\lambda \in \mathbb{R}^n$. The resulting problem, call it $P_\lambda$, with optimal objective function value
$Z_\lambda$, is

$$
\text{Min} \quad \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} X_{ij} + \sum_{i=1}^{n} \lambda_i \Bigl( \sum_{j=1}^{m} X_{ij} - 1 \Bigr)
$$

subject to (15.2)–(15.4).

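Disregarding constraint (15.2) for now, the problem decomposes by site; that is,
each site can be considered separately. Let Subproblem $P_\lambda^j$, with optimal objective
function value $Z_\lambda^j$, be the following:

$$
\begin{aligned}
\text{Min} \quad & \sum_{i=1}^{n} (c_{ij} + \lambda_i) X_{ij} \\
\text{s.t.} \quad & X_{ij} \le Y_j, && \forall i \in N, \\
& X_{ij} \in \{0, 1\}, && \forall i \in N, \\
& Y_j \in \{0, 1\}.
\end{aligned}
$$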

Solving Subproblem $P_\lambda^j$

Assume $\lambda$ is fixed. In Problem $P_\lambda^j$, site $j$ is either selected ($Y_j = 1$) or not
($Y_j = 0$). If site $j$ is not selected, then $X_{ij} = 0$ for all $i \in N$, and therefore,
$Z_\lambda^j = 0$. If site $j$ is selected, then we set $Y_j = 1$ and assign exactly those retailers
$i$ with $c_{ij} + \lambda_i < 0$ to site $j$. In this case:

$$
Z_\lambda^j = \sum_{i=1}^{n} \min\{c_{ij} + \lambda_i, 0\}. \qquad (15.5)
$$

We see that $P_\lambda^j$ is solved easily and its optimal objective function value is given
by (15.5).
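
Equation (15.5) can be evaluated in one pass over the retailers. The short Python sketch below does this for every site; the function name `site_values` and the sample data are illustrative assumptions, not part of the original text.

```python
def site_values(c, lam):
    """Return the list of values sum_i min(c[i][j] + lam[i], 0), one per site j.

    c[i][j] plays the role of c_ij and lam[i] the role of the multiplier
    lambda_i in equation (15.5).
    """
    n, m = len(c), len(c[0])
    return [sum(min(c[i][j] + lam[i], 0.0) for i in range(n)) for j in range(m)]


# Hypothetical data: 4 retailers, 3 sites, and an arbitrary multiplier vector.
c = [[0, 2, 9],
     [2, 0, 7],
     [9, 7, 0],
     [8, 6, 1]]
lam = [-1.0, -1.0, -3.0, -3.0]
z = site_values(c, lam)
print(z)
```

Only retailers whose reduced cost $c_{ij} + \lambda_i$ is negative contribute to a site's value, which is why the third site, close to the two retailers with the most negative multipliers, obtains the smallest value here.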

To solve $P_\lambda$, we must now reintroduce constraint (15.2). This constraint forces
us to choose only $p$ of the $m$ sites. In $P_\lambda$, we can incorporate this constraint by
choosing the $p$ sites with smallest values $Z_\lambda^j$. To do this, let $\pi$ be a permutation of
the numbers $1, 2, \ldots, m$ such that

$$
Z_\lambda^{\pi(1)} \le Z_\lambda^{\pi(2)} \le Z_\lambda^{\pi(3)} \le \cdots \le Z_\lambda^{\pi(m)}.
$$

Then the optimal solution to $P_\lambda$ has the objective function value:

$$
Z_\lambda = \sum_{j=1}^{p} Z_\lambda^{\pi(j)} - \sum_{i=1}^{n} \lambda_i.
$$

The value $Z_\lambda$ is a lower bound on the optimal solution of Problem P for any
vector $\lambda \in \mathbb{R}^n$. To find the best such lower bound, we consider the Lagrangian
dual:

$$
\max_{\lambda} \{Z_\lambda\}.
$$

Using the subgradient procedure (described in Sect. 6.3), we can iteratively improve
this bound.
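
The lower-bound computation and a subgradient loop can be sketched as follows. This is a minimal illustration under stated assumptions: the step-size rule (gap divided by the squared norm of the subgradient, scaled by `scale`) is a common textbook choice and is assumed here, since Sect. 6.3 is not reproduced in this chapter; the upper bound simply assigns each retailer to the closest of the p sites chosen in $P_\lambda$; and all function names are hypothetical.

```python
def lagrangian_bound(c, p, lam):
    """Solve P_lambda: return (Z_lambda, chosen sites, subgradient g)."""
    n, m = len(c), len(c[0])
    z = [sum(min(c[i][j] + lam[i], 0.0) for i in range(n)) for j in range(m)]
    sites = sorted(range(m), key=lambda j: z[j])[:p]   # p smallest site values
    bound = sum(z[j] for j in sites) - sum(lam)
    # g_i = (number of chosen sites retailer i is assigned to) - 1
    g = [sum(1 for j in sites if c[i][j] + lam[i] < 0) - 1 for i in range(n)]
    return bound, sites, g


def subgradient(c, p, iterations=200, scale=1.0):
    """Return (best lower bound, best upper bound) found."""
    n = len(c)
    lam = [0.0] * n
    best_lb, best_ub = float("-inf"), float("inf")
    for _ in range(iterations):
        lb, sites, g = lagrangian_bound(c, p, lam)
        best_lb = max(best_lb, lb)
        # Feasible solution: each retailer uses its closest chosen site.
        ub = sum(min(c[i][j] for j in sites) for i in range(n))
        best_ub = min(best_ub, ub)
        norm2 = sum(x * x for x in g)
        if norm2 == 0 or best_ub - best_lb < 1e-9:
            break                                   # (15.1) holds or gap closed
        step = scale * (best_ub - lb) / norm2       # assumed step-size rule
        lam = [lam[i] + step * g[i] for i in range(n)]
    return best_lb, best_ub


c = [[0, 2, 9],
     [2, 0, 7],
     [9, 7, 0],
     [8, 6, 1]]
lb, ub = subgradient(c, 2)
print(lb, ub)
```

Because the relaxed constraint is written as $\lambda_i(\sum_j X_{ij} - 1)$, an unassigned retailer has subgradient component $-1$, so its multiplier decreases and assigning it becomes more attractive in the next iteration.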

Upper Bounds

It is crucial to construct good upper bounds on the optimal solution value as the
subgradient procedure advances. Clearly, solutions to $P_\lambda$ will not necessarily be
feasible to Problem P. This is due to the fact that the constraints (15.1) (that each
retailer choose one and only one warehouse) may not be satisfied. The solution to
$P_\lambda$ may have retailers choosing a number of sites. If, in the solution to $P_\lambda$, each
retailer chooses only one site, then this must be the optimal solution to P, and
therefore, we stop. Otherwise, there are retailers that are assigned to several or
no sites. A simple heuristic can be implemented that fixes those retailers that are
assigned to only one site and assigns the remaining retailers to these and other
sites by choosing the next site to open in the ordering defined by $\pi$. When $p$ sites
have been selected, we can do a simple check that each retailer is assigned to its
closest site (of those selected); doing so can further improve the solution.
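
One possible reading of this heuristic is sketched below in Python. The text leaves several details open, so the choices here are assumptions: sites serving a uniquely assigned retailer are opened first, further sites are added in the order $\pi$ until p are open, and every retailer is finally moved to its closest open site. The function name `p_median_upper_bound` is hypothetical.

```python
def p_median_upper_bound(c, p, lam):
    """Build a feasible p-median solution from the solution of P_lambda."""
    n, m = len(c), len(c[0])
    z = [sum(min(c[i][j] + lam[i], 0.0) for i in range(n)) for j in range(m)]
    order = sorted(range(m), key=lambda j: z[j])      # the permutation pi
    chosen = order[:p]                                # sites open in P_lambda
    # Retailers assigned to exactly one chosen site keep that site open.
    opened = []
    for i in range(n):
        used = [j for j in chosen if c[i][j] + lam[i] < 0]
        if len(used) == 1 and used[0] not in opened:
            opened.append(used[0])
    # Open further sites in the order pi until p sites are selected.
    for j in order:
        if len(opened) == p:
            break
        if j not in opened:
            opened.append(j)
    # Final check: every retailer goes to its closest selected site.
    assign = [min(opened, key=lambda j: c[i][j]) for i in range(n)]
    cost = sum(c[i][assign[i]] for i in range(n))
    return cost, sorted(opened), assign


c = [[0, 2, 9],
     [2, 0, 7],
     [9, 7, 0],
     [8, 6, 1]]
lam = [-1.0, -1.0, -3.0, -3.0]
ub_cost, ub_sites, ub_assign = p_median_upper_bound(c, 2, lam)
print(ub_cost, ub_sites, ub_assign)
```

In this small instance the second retailer is unassigned in $P_\lambda$; the final closest-site pass places it at the first site.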

Computational Results 

Below we give a table listing results of various computational experiments
(Table 15.1). The retailer locations were chosen uniformly over the unit square.
For simplicity, we made each retailer location a potential site for a warehouse;
thus, $m = n$. The cost of assigning a retailer to a site was the Euclidean distance
between the two locations. The values of $w_i$ were chosen uniformly over the
unit interval. We applied the algorithm mentioned above to many problems and
recorded the relative error of the best solution found and the computational time
required. The algorithm is terminated when the relative error is below 1% or when
a prespecified number of iterations is reached. The numbers below “Error” are the
relative errors averaged over 10 randomly generated problem instances. The numbers
below “CPU time” are the CPU times averaged over the 10 problem instances.
All computational times are on an IBM RISC 6000 Model 950.



TABLE 15.1. Computational results for the p-median algorithm

n      p      Error    CPU time
10     3      0.3%     0.2s
20     4      1.7%     2.6s
50     5      1.4%     20.7s
100    7      1.3%     87.7s
200    10     2.4%     715.4s

15.3 An Algorithm for the Single-Source Capacitated
Facility Location Problem

Consider the p-median problem, where we make the following two changes in our
assumptions:

• The number of warehouses to locate (p) is not fixed beforehand.

• If a warehouse is located at site $j$:

    o A fixed cost $f_j$ is incurred, and

    o There is a capacity $q_j$ on the amount of demand it can serve.

The problem is to decide where to locate the warehouses and then how the retailers 
should be assigned to the open warehouses in such a way that the total cost is 
minimized. We see that the problem is considerably more complicated than the 
p -median problem. We now have capacity constraints on the warehouses, and 
therefore a retailer will not always be assigned to its nearest warehouse. Allowing 
the optimization to choose the appropriate number of warehouses also adds to the 
level of difficulty. 

This problem is called the single-source capacitated facility location problem
(CFLP), or sometimes the capacitated concentrator location problem (CCLP).
This problem is used successfully in Chap. 17 as a framework for solving the
capacitated vehicle routing problem.

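Using the same decision variables as in the p-median problem, we formulate the
single-source CFLP as the following integer linear program:

$$
\begin{aligned}
\text{Min} \quad & \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} X_{ij} + \sum_{j=1}^{m} f_j Y_j \\
\text{s.t.} \quad & \sum_{j=1}^{m} X_{ij} = 1, && \forall i \in N, && (15.6) \\
& \sum_{i=1}^{n} w_i X_{ij} \le q_j Y_j, && \forall j \in M, && (15.7) \\
& X_{ij}, Y_j \in \{0, 1\}, && \forall i \in N,\ j \in M. && (15.8)
\end{aligned}
$$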

Constraints (15.6) (along with the integrality conditions (15.8)) ensure that each 
retailer is assigned to exactly one warehouse. Constraints (15.7) ensure that the 
warehouse’s capacity is not exceeded and also that if a warehouse is not located 
at site j, no retailer can be assigned to that site. 

Let $Z^*$ be the optimal solution value of single-source CFLP. Note we have restricted
the assignment variables ($X$) to be integer. A related problem, where this
assumption is relaxed, is simply called the (multiple-source) capacitated facility
location problem. In that version, a retailer’s demand can be split between any
number of warehouses. In the single-source CFLP, it is required that each retailer
have only one warehouse supplying it. In many logistics applications, this is a 
realistic assumption since without this restriction, optimal solutions might have 
a retailer receive many deliveries of the same product (each for, conceivably, a 
very small amount of the product). Clearly, from a managerial, marketing, and 
accounting point of view, restricting deliveries to come from only one warehouse 
is a more appropriate delivery strategy. 

Several algorithms have been proposed to solve the CFLP in the literature; all 
are based on the Lagrangian relaxation technique. This includes Neebe and Rao 
(1983), Barcelo and Casanovas (1984), Klincewicz and Luss (1986), and Pirkul 
(1987). The one we derive here is similar to Pirkul’s algorithm, which seems to be 
the most effective. 

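We apply the Lagrangian relaxation technique by including constraints (15.6)
in the objective function. For any $\lambda \in \mathbb{R}^n$, consider the following problem $P_\lambda$:

$$
\text{Min} \quad \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} X_{ij} + \sum_{j=1}^{m} f_j Y_j + \sum_{i=1}^{n} \lambda_i \Bigl( \sum_{j=1}^{m} X_{ij} - 1 \Bigr)
$$

subject to (15.7)–(15.8).

Let $Z_\lambda$ be its optimal solution and note that

$$
Z_\lambda \le Z^*, \quad \forall \lambda \in \mathbb{R}^n.
$$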

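To solve $P_\lambda$, as in the p-median problem, we separate the problem by site. For a
given $j \in M$, define the following problem $P_\lambda^j$, with the optimal objective function
value $Z_\lambda^j$:

$$
\begin{aligned}
\text{Min} \quad & \sum_{i=1}^{n} (c_{ij} + \lambda_i) X_{ij} + f_j Y_j \\
\text{s.t.} \quad & \sum_{i=1}^{n} w_i X_{ij} \le q_j Y_j, \\
& X_{ij} \in \{0, 1\}, && \forall i \in N, \\
& Y_j \in \{0, 1\}.
\end{aligned}
$$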



Solving $P_\lambda^j$

Problem $P_\lambda^j$ can be solved efficiently. In the optimal solution to $P_\lambda^j$, $Y_j$ is either
0 or 1. If $Y_j = 0$, then $X_{ij} = 0$ for all $i \in N$. If $Y_j = 1$, then the problem is no
more difficult than a single constraint 0-1 knapsack problem, for which efficient
algorithms exist; see, for example, Nauss (1976). If the optimal knapsack solution
is less than $-f_j$, then the corresponding optimal solution to $P_\lambda^j$ is found by setting
$Y_j = 1$ and $X_{ij}$ according to the knapsack solution, indicating whether retailer $i$
is assigned to site $j$. If the optimal knapsack solution is more than $-f_j$, then the
optimal solution to $P_\lambda^j$ is found by setting $Y_j = 0$ and $X_{ij} = 0$ for all $i \in N$.
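
The site subproblem can be sketched with a small dynamic program. The Python code below assumes integer demands and capacity so that a standard 0-1 knapsack table applies; this is a simplification for illustration and is not the specialized algorithm of Nauss (1976). The function name `solve_site` and the sample numbers are hypothetical.

```python
def solve_site(costs, lam, w, q, f):
    """Solve the subproblem for one site j.

    costs[i] = c_ij, lam[i] = lambda_i, w[i] = integer demand of retailer i,
    q = integer capacity q_j, f = fixed cost f_j.
    Returns (optimal value, Y_j, list of X_ij).
    """
    n = len(costs)
    profit = [-(costs[i] + lam[i]) for i in range(n)]   # gain from assigning i
    # best[cap] = (max total profit, chosen retailers) with total demand <= cap
    best = [(0.0, [])] * (q + 1)
    for i in range(n):
        if profit[i] <= 0:
            continue                                    # never worth assigning
        new = best[:]
        for cap in range(w[i], q + 1):
            cand = best[cap - w[i]][0] + profit[i]
            if cand > new[cap][0]:
                new[cap] = (cand, best[cap - w[i]][1] + [i])
        best = new
    knap_value, chosen = -best[q][0], best[q][1]
    if knap_value + f < 0:          # knapsack value is below -f_j: open the site
        x = [1 if i in chosen else 0 for i in range(n)]
        return knap_value + f, 1, x
    return 0.0, 0, [0] * n


# Hypothetical site with three retailers.
z_open, y_open, x_open = solve_site([4, 3, 6], [-10, -8, -8], [3, 2, 2], 4, 5)
z_shut, y_shut, x_shut = solve_site([4, 3, 6], [-10, -8, -8], [3, 2, 2], 4, 9)
print(z_open, y_open, x_open)
print(z_shut, y_shut, x_shut)
```

With fixed cost 5 the best knapsack (retailers 1 and 2, value -7) more than pays for the site, so it is opened; with fixed cost 9 it does not, and the site stays closed.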

The solution to $P_\lambda$ is then given by

$$
Z_\lambda = \sum_{j=1}^{m} Z_\lambda^j - \sum_{i=1}^{n} \lambda_i.
$$

For any vector $\lambda \in \mathbb{R}^n$, this is a lower bound on the optimal solution $Z^*$. In order
to find the best such lower bound, we use a subgradient procedure.

Note that if the problem has a constraint on the number of warehouses (facilities)
that can be opened (chosen), this can be handled in essentially the same way as
it was handled in the algorithm for the p-median problem.

Upper Bounds 

For a given set of multipliers, if the values $\{X\}$ satisfy (15.6), then we have
an optimal solution to Problem P, and we stop. Otherwise, we perform a simple
subroutine to find a feasible solution to P. The procedure is based on the observation
that the knapsack solutions found when solving $P_\lambda$ give us some information
concerning the benefit of setting up a warehouse at a site (relative to the current
vector $\lambda$). If, for example, the knapsack solution corresponding to a given site is
0, that is, the optimal knapsack is empty, then this is most likely not a “good”
site to select at this time. In contrast, if the knapsack solution has a very negative
cost, then this is a “good” site. Given the values $Z_\lambda^j$ for each $j \in M$, let $\pi$ be a
permutation of $1, 2, \ldots, m$ such that

$$
Z_\lambda^{\pi(1)} \le Z_\lambda^{\pi(2)} \le \cdots \le Z_\lambda^{\pi(m)}.
$$

The procedure we perform allocates retailers to sites in a myopic fashion. Let
M be the minimum possible number of warehouses used in the optimal solution
to CFLP. This can be found by solving the bin-packing problem defined on the
values $w_i$ with bin capacities $q_j$; see Sect. 4.2. Starting with the “best” site, in
this case site $\pi(1)$, assign the retailers in its optimal knapsack to this site. Then,
following the indexing of the knapsack solutions, take the next-“best” site [say site
$j = \pi(2)$] and solve a new knapsack problem: one defined with costs $\bar{c}_{ij} = c_{ij} + \lambda_i$
for each retailer $i$ still unassigned. Assign all retailers in this knapsack solution to
site $j$. If this optimal knapsack is empty, then a warehouse is not located at that
site, and we go on to the next site. Continue in this manner until M warehouses
are located.

The solution may still not be a feasible solution to P since some retailers may
not be assigned to a site. In this case, unassigned retailers are assigned to sites
that are already chosen, where they fit with minimum additional cost. If needed,
additional warehouses may be opened following the ordering of $\pi$. A local improvement
heuristic can be implemented to improve on this solution, using simple
interchanges between retailers.

Computational Results

We now report on various computational experiments using this algorithm 
(Table 15.2). The retailer locations were chosen uniformly over the unit square. 
Again, for simplicity, we made each retailer location a potential site for a ware- 
house; thus, m = n. The fixed cost of a site was chosen uniformly between 0 and 
1. The cost of assigning a retailer to a site was the Euclidean distance between 
the two locations. The values of $w_i$ were chosen uniformly over the interval 0 to
\ with warehouse capacity equal to 1. We applied the algorithm mentioned above 
to 10 problems and recorded the average relative error of the best solution found 
and the average computation time required. The algorithm is terminated when the 
relative error is below 1% or when a prespecified number of iterations is reached. 
The numbers below “Error” are the relative errors averaged over the 10 randomly
generated problem instances. The numbers below “CPU Time” are the CPU times
averaged over the 10 problem instances. All computational times are on an IBM
RISC 6000 Model 950.

TABLE 15.2. Computational results for the single-source CFLP algorithm

n      Error    CPU time
10     1.2%     1.2s
20     1.0%     8.1s
50     1.1%     110.0s
100    1.1%     558.3s


15.4 A Distribution System Design Problem 

So far, the location models we have considered have been concerned with minimiz- 
ing the costs of transporting products between warehouses and retailers. We now 
present a more realistic model that considers the cost of transporting the product 
from manufacturing facilities to the warehouses as well. 

Consider the following warehouse location problem. A set of plants and retailers 
are geographically dispersed in a region. Each retailer experiences demands for a 
variety of products that are manufactured at the plants. A set of warehouses must 
be located in the distribution network from a list of potential sites. 

The cost of locating a warehouse includes the transportation cost per unit from 
warehouses to retailers but also the transportation cost from plants to warehouses. 
In addition, as in the CFLP, there is a site-dependent fixed cost for locating each 
warehouse. 

The data for the problem are the following:

• $L$ = number of plants; we will also let $L = \{1, 2, \ldots, L\}$;

• $J$ = number of potential warehouse sites; also let $J = \{1, 2, \ldots, J\}$;

• $I$ = number of retailers; also let $I = \{1, 2, \ldots, I\}$;

• $K$ = number of products; also let $K = \{1, 2, \ldots, K\}$;

• $W$ = number of warehouses to locate;

• $c_{\ell jk}$ = cost of shipping one unit of product $k$ from plant $\ell$ to
warehouse site $j$;

• $d_{jik}$ = cost of shipping one unit of product $k$ from warehouse
site $j$ to retailer $i$;

• $f_j$ = fixed cost of locating a warehouse at site $j$;

• $v_{\ell k}$ = supply of product $k$ at plant $\ell$;

• $w_{ik}$ = demand for product $k$ at retailer $i$;

• $s_k$ = volume of one unit of product $k$;

• $q_j$ = capacity (in volume) of a warehouse at site $j$.

We make the additional assumption that a retailer gets delivery for a product 
from one warehouse only. This does not preclude solutions where a retailer gets 
shipments from different warehouses, but these shipments must be for different 
products. On the other hand, we assume that the warehouse can receive shipments 
from any plant and for any amount of product. 

The problem is to determine where to locate the warehouses, how to ship the 
product from the plants to the warehouses, and also how to ship the product from 
the warehouses to the retailers. This problem is similar to that analyzed by Pirkul 
and Jayaraman (1996). 

We again use a mathematical programming approach. Define the following de- 
cision variables: 



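$U_{\ell jk}$ = amount of product $k$ shipped from plant $\ell$ to warehouse $j$,

for each $\ell \in L$, $j \in J$, and $k \in K$. Also, define

$$
Y_j = \begin{cases} 1, & \text{if a warehouse is located at site } j, \\ 0, & \text{otherwise,} \end{cases}
$$

and

$$
X_{jik} = \begin{cases} 1, & \text{if retailer } i \text{ receives product } k \text{ from warehouse } j, \\ 0, & \text{otherwise,} \end{cases}
$$

for each $j \in J$, $i \in I$, and $k \in K$.

Then the distribution system design problem can be formulated as the following
integer program:

$$
\begin{aligned}
\text{Min} \quad & \sum_{\ell=1}^{L} \sum_{j=1}^{J} \sum_{k=1}^{K} c_{\ell jk} U_{\ell jk} + \sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{k=1}^{K} d_{jik} w_{ik} X_{jik} + \sum_{j=1}^{J} f_j Y_j \\
\text{s.t.} \quad & \sum_{j=1}^{J} X_{jik} = 1, && \forall i \in I,\ k \in K, && (15.9) \\
& \sum_{i=1}^{I} \sum_{k=1}^{K} s_k w_{ik} X_{jik} \le q_j Y_j, && \forall j \in J, && (15.10) \\
& \sum_{i=1}^{I} w_{ik} X_{jik} = \sum_{\ell=1}^{L} U_{\ell jk}, && \forall j \in J,\ k \in K, && (15.11) \\
& \sum_{j=1}^{J} U_{\ell jk} \le v_{\ell k}, && \forall \ell \in L,\ k \in K, && (15.12) \\
& \sum_{j=1}^{J} Y_j = W, && && (15.13) \\
& Y_j, X_{jik} \in \{0, 1\}, && \forall i \in I,\ j \in J,\ k \in K, && (15.14) \\
& U_{\ell jk} \ge 0, && \forall \ell \in L,\ j \in J,\ k \in K. && (15.15)
\end{aligned}
$$
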


The objective function measures the transportation costs between plants and ware- 
houses, those costs between warehouses and retailers, and also the fixed cost of 
locating the warehouses. Constraints (15.9) ensure that each retailer/product pair
is assigned to one warehouse. Constraints (15.10) guarantee that the capacity of 
the warehouses is not exceeded. Constraints (15.11) ensure that there is a con- 
servation of the flow of products at each warehouse; that is, the amount of each 
product arriving at a warehouse from the plants is equal to the amount being 
shipped from the warehouse to the retailers. Constraints (15.12) are the supply 
constraints. Constraints (15.13) ensure that we locate exactly W warehouses. 

The model can handle several extensions, such as a warehouse handling fee or
a limit on the distance of any link used just as in the p-median problem. Another
interesting extension is when there is a fixed number of possible warehouse types 
from which to choose. Each type has a specific cost along with a specific capacity. 
The model can easily be extended to handle this situation (see Exercise 15.1). 

As in the previous problems, we will use Lagrangian relaxation. We relax constraints
(15.9) (with multipliers $\lambda_{ik}$) and constraints (15.11) (with multipliers $\theta_{jk}$).
The resulting problem is

$$
\begin{aligned}
\text{Min} \quad & \sum_{\ell=1}^{L} \sum_{j=1}^{J} \sum_{k=1}^{K} c_{\ell jk} U_{\ell jk} + \sum_{j=1}^{J} \sum_{i=1}^{I} \sum_{k=1}^{K} d_{jik} w_{ik} X_{jik} + \sum_{j=1}^{J} f_j Y_j \\
& + \sum_{j=1}^{J} \sum_{k=1}^{K} \theta_{jk} \Bigl[ \sum_{i=1}^{I} w_{ik} X_{jik} - \sum_{\ell=1}^{L} U_{\ell jk} \Bigr] + \sum_{i=1}^{I} \sum_{k=1}^{K} \lambda_{ik} \Bigl[ 1 - \sum_{j=1}^{J} X_{jik} \Bigr]
\end{aligned}
$$

subject to (15.10), (15.12)–(15.15).


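Let $Z_{\lambda,\theta}$ be the optimal solution to this problem. This problem can be decomposed
into two separate problems $P_1$ and $P_2$:

$$
\begin{aligned}
\text{Problem } P_1: \quad Z_1 = \text{Min} \quad & \sum_{\ell=1}^{L} \sum_{j=1}^{J} \sum_{k=1}^{K} [c_{\ell jk} - \theta_{jk}] U_{\ell jk} \\
\text{s.t.} \quad & \sum_{j=1}^{J} U_{\ell jk} \le v_{\ell k}, && \forall \ell \in L,\ k \in K, && (15.16) \\
& U_{\ell jk} \ge 0, && \forall \ell \in L,\ j \in J,\ k \in K.
\end{aligned}
$$

$$
\begin{aligned}
\text{Problem } P_2: \quad Z_2 = \text{Min} \quad & \sum_{j=1}^{J} \sum_{i=1}^{I} \sum_{k=1}^{K} [d_{jik} w_{ik} + \theta_{jk} w_{ik} - \lambda_{ik}] X_{jik} + \sum_{j=1}^{J} f_j Y_j \\
\text{s.t.} \quad & \sum_{i=1}^{I} \sum_{k=1}^{K} s_k w_{ik} X_{jik} \le q_j Y_j, && \forall j \in J, && (15.17) \\
& \sum_{j=1}^{J} Y_j = W, && && (15.18) \\
& Y_j, X_{jik} \in \{0, 1\}, && \forall i \in I,\ j \in J,\ k \in K.
\end{aligned}
$$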


Solving $P_1$

Problem $P_1$ can be solved separately for each plant/product pair. In fact, the
objective functions of each of these subproblems can be improved (without loss in
computation time) by adding the constraints

$$
s_k \sum_{\ell=1}^{L} U_{\ell jk} \le q_j, \quad \forall j \in J,\ k \in K. \qquad (15.19)
$$

For each plant/product combination, say plant $\ell$ and product $k$, sort the $J$ values
$\bar{c}_j = c_{\ell jk} - \theta_{jk}$. Starting with the smallest value of $\bar{c}_j$, say $\bar{c}_{j'}$, if $\bar{c}_{j'} \ge 0$, then the
solution is to ship none of this product from this plant. If $\bar{c}_{j'} < 0$, then ship as
much of this product as possible along arc $(\ell, j')$ subject to satisfying constraints
(15.16) and (15.19). Then if the supply $v_{\ell k}$ has not been completely shipped, do
the same for the next-cheapest arc, as long as it has negative reduced cost ($\bar{c}$).
Continue in this manner until all of the product has been shipped or the reduced
costs are no longer negative. Then proceed to the next plant/product combination,
repeating this procedure. Continue until all the plant/product combinations have
been scanned in this fashion.
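
The scan described above can be sketched for a single product as follows. This is a minimal Python illustration under stated assumptions: plants are processed in index order, and the room left at each warehouse under (15.19) is shared across plants and reduced as shipments are made, which is one way to read "subject to satisfying constraints (15.16) and (15.19)". The function name `solve_p1_for_product` and the data are hypothetical.

```python
def solve_p1_for_product(c, theta, v, q, s):
    """Greedy scan for one product k.

    c[l][j]  : unit shipping cost from plant l to warehouse site j
    theta[j] : multiplier theta_jk for this product
    v[l]     : supply of the product at plant l
    q[j]     : capacity (volume) of site j
    s        : volume s_k of one unit of the product
    Returns (objective value, shipment matrix U).
    """
    L, J = len(c), len(theta)
    room = [q[j] / s for j in range(J)]     # units still allowed by (15.19)
    U = [[0.0] * J for _ in range(L)]
    value = 0.0
    for l in range(L):
        supply = v[l]
        # Scan arcs by increasing reduced cost; stop once it is nonnegative.
        for j in sorted(range(J), key=lambda j: c[l][j] - theta[j]):
            reduced = c[l][j] - theta[j]
            if reduced >= 0 or supply <= 0:
                break
            amount = min(supply, room[j])
            U[l][j] = amount
            supply -= amount
            room[j] -= amount
            value += reduced * amount
    return value, U


# Hypothetical data: two plants, two sites, unit volume 1.
value, U = solve_p1_for_product(
    c=[[2, 5], [4, 1]], theta=[3, 3], v=[10, 10], q=[6, 20], s=1)
print(value, U)
```

Here the first plant ships only 6 of its 10 units, because the first site is full and the reduced cost of its other arc is positive.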

Solving $P_2$

Solving Problem $P_2$ is similar to solving the subproblem in the CFLP. For now,
we can ignore constraints (15.18). Then we separate the problem by warehouse. In
the problem corresponding to warehouse $j$, either $Y_j = 0$ or $Y_j = 1$. If $Y_j = 0$, then
$X_{jik} = 0$ for all $i \in I$ and $k \in K$. If $Y_j = 1$, then we get a knapsack problem with
$IK$ items, one for each retailer/product pair. Let $Z_2^j$ be the objective function
value when $Y_j$ is set to 1 and the resulting knapsack problem is solved. After
having solved each of these, let $\pi$ be a permutation of the numbers $1, 2, \ldots, J$ such
that

$$
Z_2^{\pi(1)} \le Z_2^{\pi(2)} \le \cdots \le Z_2^{\pi(J)}.
$$

The optimal solution to $P_2$ is to choose the $W$ smallest values:

$$
Z_2 = \sum_{j=1}^{W} Z_2^{\pi(j)}.
$$

For fixed vectors $\lambda$ and $\theta$, the lower bound is

$$
Z_{\lambda,\theta} = Z_1 + Z_2 + \sum_{i=1}^{I} \sum_{k=1}^{K} \lambda_{ik}.
$$

To maximize this bound, that is,

$$
\max_{\lambda,\theta} \{Z_{\lambda,\theta}\},
$$

we again use the subgradient optimization procedure.

Upper Bounds 

At each iteration of the subgradient procedure, we attempt to construct a feasible
solution to the problem. Consider Problem $P_2$. Its solution may have a
retailer/product combination assigned to several warehouses. We determine the set
of retailer/product combinations that are assigned to one and only one warehouse and
fix these. Other retailer/product combinations are assigned to warehouses using
the following mechanism. For each retailer/product combination, we determine
the cost of assigning it to a particular warehouse. After determining that this
assignment is feasible (from a warehouse-capacity point of view), we calculate the
assignment cost as the cost of shipping all of the demand for this retailer/product
combination through the warehouse plus the cost of shipping the demand from the
plants to the warehouse (along one or more arcs from the plants to the warehouse).
For each retailer/product combination, we determine the penalty associated with
assigning the shipment to its second-best warehouse instead of its best warehouse.
We then assign the retailer/product combination with the highest such penalty
and update all arc flows and remaining capacities. We continue in this manner
until all retailer/product combinations have been assigned to warehouses.
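
The penalty-based assignment can be sketched as follows. This Python fragment is a simplified illustration: it assumes the assignment cost of each retailer/product pair to each warehouse has already been computed (including the plant-to-warehouse part) and is held fixed, whereas the text updates arc flows after every assignment. The function name `regret_assign` and the data are hypothetical.

```python
def regret_assign(cost, volume, capacity):
    """Assign retailer/product pairs to warehouses by largest penalty first.

    cost[a][j]  : cost of serving pair a through warehouse j
    volume[a]   : volume required by pair a
    capacity[j] : remaining capacity of warehouse j
    Returns a dict mapping each pair to its warehouse.
    """
    cap = list(capacity)
    unassigned = set(range(len(cost)))
    assign = {}
    while unassigned:
        best_pair, best_site, best_regret = None, None, -1.0
        for a in sorted(unassigned):
            feasible = sorted(
                (cost[a][j], j) for j in range(len(cap)) if volume[a] <= cap[j])
            if not feasible:
                raise ValueError(f"pair {a} fits in no warehouse")
            # Penalty: extra cost of the second-best warehouse over the best;
            # infinite when only one warehouse can still take the pair.
            if len(feasible) > 1:
                regret = feasible[1][0] - feasible[0][0]
            else:
                regret = float("inf")
            if regret > best_regret:
                best_pair, best_site, best_regret = a, feasible[0][1], regret
        assign[best_pair] = best_site
        cap[best_site] -= volume[best_pair]
        unassigned.remove(best_pair)
    return assign


# Hypothetical data: three pairs, two warehouses of capacity 4.
assign = regret_assign(
    cost=[[1, 10], [2, 3], [4, 5]], volume=[2, 2, 2], capacity=[4, 4])
print(assign)
```

Assigning the pair with the largest penalty first protects the pairs that would suffer most if their best warehouse filled up.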

Computational results for this problem appear at the end of Chap. 20.



15.5 The Structure of the Asymptotic Optimal Solution 

In this section, we describe a region-partitioning scheme to solve large instances 
of the CFLP. 

Assume there are $n$ retailers located at points $\{x_1, x_2, \ldots, x_n\}$. Each retailer
also serves as a potential site for a warehouse of fixed capacity $q$. The fixed cost
of locating a warehouse at a site is assumed to be proportional to the distance the
site is from a manufacturing facility located at $x_0$, which is assumed (without loss
of generality) to be the origin (0, 0). Retailer $i$ has a demand $w_i$, which is assumed
to be less than or equal to $q$. Without loss of generality, we assume $q = 1$, and
therefore, $w_i \in [0, 1]$ for each $i \in N$. Let $\alpha$ be the per-unit cost of transportation
between warehouses and the manufacturing facility, and let $\beta$ be the per-unit cost
of transportation between warehouses and retailers.

We assume the retailer locations are independently and identically distributed
in a compact region $A \subset \mathbb{R}^2$ according to some distribution $\mu$. Assume the retailer
demands are independently and identically distributed according to a probability
measure $\phi$ on [0, 1]. The bin-packing constant associated with the distribution $\phi$
(denoted by $\gamma_\phi$ or simply $\gamma$) is the asymptotic number of bins used per item in an
optimal packing of the retailer demands into unit size bins, when items are drawn
randomly from the distribution $\phi$ (see Sect. 5.2).

The following theorem shows that if the retailer locations and demand sizes 
are random (from a general class of distributions), then as the problem size in- 
creases, the optimal solution has a very particular structure. This structure can 
be exploited using a region-partitioning scheme, as demonstrated below. 

Theorem 15.5.1 Let $x_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random
variables having a distribution $\mu$ with compact support in $\mathbb{R}^2$. Let $\|x\|$ be the Euclidean
distance between the manufacturing facility and the point $x \in \mathbb{R}^2$, and
let

$$
E(d) = \int_{\mathbb{R}^2} \|x\| \, d\mu(x).
$$

Let the demands $w_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random variables
having a distribution $\phi$ with bin-packing constant equal to $\gamma$. Then, almost surely,

$$
\lim_{n \to \infty} \frac{1}{n} Z^* = \alpha \gamma E(d).
$$

This analysis demonstrates that simple approaches that consider only the ge- 
ography and the packing of the demands can be very efficient on large problem 
instances. Asymptotically, this is, in fact, the optimal strategy. This analysis also 
demonstrates that, asymptotically, the cost of transportation between retailers and 
warehouses becomes a very small fraction (eventually zero) of the total cost. 
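
As a hedged illustration of the limit in Theorem 15.5.1, consider a case chosen only for easy arithmetic (it is an assumption, not an example from the text): retailers are uniformly distributed on a disk of radius $R$ centered at the manufacturing facility, and every demand equals $1/2$, so that two demands fill a unit bin exactly and $\gamma = 1/2$. The distance from the origin then has density $2r/R^2$ on $[0, R]$, and

$$
E(d) = \int_0^R r \cdot \frac{2r}{R^2} \, dr = \frac{2R}{3},
\qquad
\lim_{n \to \infty} \frac{1}{n} Z^* = \alpha \cdot \frac{1}{2} \cdot \frac{2R}{3} = \frac{\alpha R}{3}.
$$

In this case the asymptotic cost per retailer is $\alpha R / 3$ and does not depend on $\beta$, which agrees with the remark that the cost of transportation between warehouses and retailers vanishes as a fraction of the total.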



15.6 Exercises 



Exercise 15.1. In the distribution system design problem, explain how the solution
methodology changes when there is a fixed number of possible warehouse
capacities. For example, at each site, if we decide to install a warehouse, we can
install a small, medium, or large one.

Exercise 15.2. Prove Theorem 15.5.1. 

Exercise 15.3. Show how any instance of the bin-packing problem (see Part I) 
can be formulated as an instance of the single-source CFLP. 

Exercise 15.4. Consider Problem $P_1$ of Sect. 15.4.

(a) Show that this formulation can be strengthened by adding the constraints:

$$
\sum_{\ell=1}^{L} \sum_{k=1}^{K} s_k U_{\ell jk} \le q_j, \quad \forall j \in J.
$$

(b) Show that this new formulation can be transformed to a specialized kind of
linear program called a transportation problem.

(c) Why might we not want to use this stronger formulation?

Exercise 15.5. (Mirchandani and Francis 1990) Define the uncapacitated facility
location problem (UFLP) in the following way. Let $F_j$ be the fixed charge of
opening a facility at site $j$, for $j = 1, 2, \ldots, m$.

$$
\begin{aligned}
\text{Problem UFLP}: \quad \text{Min} \quad & \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} X_{ij} + \sum_{j=1}^{m} F_j Y_j \\
\text{s.t.} \quad & \sum_{j=1}^{m} X_{ij} = 1, && \forall i \in N, \\
& X_{ij} \le Y_j, && \forall i \in N,\ j \in M, \\
& X_{ij}, Y_j \in \{0, 1\}, && \forall i \in N,\ j \in M.
\end{aligned}
$$

Show that UFLP is $\mathcal{NP}$-Hard by showing that any instance of the $\mathcal{NP}$-Hard node
cover problem can be formulated as an instance of UFLP. The node cover problem
is defined as follows: Given a graph G and an integer k, does there exist a subset
of k nodes of G that cover all the arcs of G? (Node v is said to cover arc e if v is
an endpoint of e.)

Exercise 15.6. (Mirchandani and Francis 1990) It appears that the p-median
problem can be solved by solving the resulting problem UFLP (see Exercise 15.5)
for different values of $F = F_j$, $\forall j$, until a value $F^*$ is found where the UFLP opens
exactly p facilities. Show that this method does not work by giving an instance
of a 2-median problem for which no value of $F$ provides an optimal solution to
UFLP with two open facilities.



Part IV 



Vehicle Routing Models 



16 

The Capacitated VRP with Equal 
Demands 



16.1 Introduction 

A large part of many logistics systems involves the management of a fleet of vehicles 
used to serve warehouses, retailers, and/or customers. In order to control the costs 
of operating the fleet, a dispatcher must continuously make decisions on how much 
to load on each vehicle and where to send it. These types of problems fall under 
the general class of vehicle routing problems mentioned in Chap. 1. 

The most basic vehicle routing problem (VRP) is the single-depot capacitated 
vehicle routing problem (CVRP). It can be described as follows: A set of customers 
has to be served by a fleet of identical vehicles of limited capacity. The vehicles are 
initially located at a given depot. The objective is to find a set of routes for the 
vehicles of minimal total length. Each route begins at the depot, visits a subset of 
the customers, and returns to the depot without violating the capacity constraint. 

Consider the following scenario. A customer requests w units of product. If we 
allow this load to be split between more than one vehicle (i.e., the customer gets 
several deliveries, which together sum up to the total load requested), then we can 
view the demand for w units as w different customers each requesting one unit of 
product located at the same point. The capacity constraint can then be viewed 
as simply the maximum number of customers (in this new problem) that can be 
visited by a single vehicle. This is the capacity $Q \ge 1$. Therefore, if we allow this 
splitting of demands, and this may not be a desirable property (we investigate the 
unsplit-demand case in Chap. 17), there is no loss in generality in assuming that 



D. Simchi-Levi et al., The Logic of Logistics: Theory, Algorithms, and Applications 
for Logistics Management, Springer Series in Operations Research and Financial Engineering, 
DOI 10.1007/978-1-4614-9149-1_16, © Springer Science+Business Media New York 2014 






each customer has the same demand, namely, one unit, and the vehicle can visit 
at most Q of these customers on a route. Therefore, this model is sometimes called 
the CVRP with splittable demands or the ECVRP. 

We denote the depot by $x_0$ and the set of customers by $N = \{x_1, x_2, \ldots, x_n\}$. 
The set $N_0 = N \cup \{x_0\}$ designates all customers and the depot. The customers 
and the depot are represented by a set of nodes on an undirected graph 
$G = (N_0, E)$. We denote by $d_i$ the distance between customer $i$ and the depot, by 
$d_{\max} = \max_{i \in N} d_i$ the distance from the depot to the furthest customer, and by 
$d_{ij}$ the distance between customer $i$ and customer $j$. The distance matrix $\{d_{ij}\}$ is 
assumed to be symmetric and to satisfy the triangle inequality; that is, $d_{ij} = d_{ji}$ 
for all $i, j$ and $d_{ij} \le d_{ik} + d_{kj}$ for all $i, k, j$. We denote the optimal solution value 
of the CVRP by $Z^*$ and the solution provided by a heuristic H by $Z^H$. 

In what follows, the optimal traveling salesman tour plays an important role. 
So, for any set $S \subseteq N_0$, let $L^*(S)$ be the length of the optimal traveling salesman 
tour through the set of points $S$. Also, let $L^{\alpha}(S)$ be the length of an $\alpha$-optimal 
traveling salesman tour through $S$, that is, one whose length is bounded from 
above by $\alpha L^*(S)$, $\alpha \ge 1$. 

The graph depicted in Fig. 16.1, which is denoted by $\mathcal{G}(t, s)$, also plays an 
important role in our worst-case analyses. It consists of $s$ groups of $Q$ nodes and 
another $s - 1$ nodes, called white nodes, separating the groups. The nodes within 
the same group have zero interdistance and each group is connected to the depot 
by an arc of unit length. The white nodes are of zero distance apart and $t$ units' 
distance away from the depot. Each white node is connected to the two groups 
of nodes it separates by an arc of unit length. Note that when $0 \le t \le 2$, $\mathcal{G}(t, s)$ 
satisfies the triangle inequality [if an edge $(i, j)$ is not shown in the graph, then the 
distance between node $i$ and node $j$ is defined as the length of the shortest path 
from $i$ to $j$]. Also note that whenever $0 \le t \le 2$, the tour depicted in Fig. 16.2 is 
an optimal traveling salesman tour of length $2s$. 




FIGURE 16.1. Every group contains Q customers with interdistance zero 

In this chapter, we analyze this problem using the two tools developed earlier, 
worst-case and average-case analyses. Later, in Chap. 17, we will analyze a more 
general model of the CVRP. 



16.2 Worst-Case Analysis of Heuristics 

A simple heuristic for the CVRP, suggested by Haimovich and Rinnooy Kan (1985) 
and later modified by Altinkemer and Gavish (1990), is to partition a traveling 
salesman tour into segments, such that each segment of customers is served by a 
single vehicle; that is, each segment has no more than $Q$ points. The heuristic, 
called the iterated tour-partitioning (ITP) heuristic, starts from a traveling 
salesman tour through all $n = |N|$ customers and the depot. Starting at the depot 
and following the tour in an arbitrary orientation, the customers and the depot 
are numbered $x^{(0)}, x^{(1)}, x^{(2)}, \ldots, x^{(n)}$, where $x^{(0)}$ is the depot. We partition the 
path from $x^{(1)}$ to $x^{(n)}$ into $\lceil n/Q \rceil$ (or $\lceil n/Q \rceil + 1$) disjoint segments, such that each one 
contains no more than $Q$ customers, and connect the endpoints of each segment to 
the depot. The first segment contains only customer $x^{(1)}$. All the other segments 
contain exactly $Q$ customers, except maybe the last one. This defines one feasible 
solution to the problem. We can repeat the above construction by shifting the 
endpoints of all but the first and last segments up by one position in the direction of 
the orientation. This can be repeated $Q - 1$ times, producing a total of $Q$ different 
solutions. We then choose the best of the set of $Q$ solutions generated. 
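
The construction just described can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes Euclidean distances, customers given as coordinate pairs, and an initial tour supplied as a list of customer indices in tour order. The names `route_cost` and `itp` are hypothetical.

```python
import math

def route_cost(points, depot, route):
    """Length of the trip depot -> route[0] -> ... -> route[-1] -> depot."""
    stops = [depot] + [points[i] for i in route] + [depot]
    return sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))

def itp(points, depot, tour, Q):
    """Iterated tour partitioning (sketch, Euclidean distances assumed).

    tour lists the customer indices in the order x^(1), ..., x^(n).
    Returns (cost, routes) for the cheapest of the Q shifted partitions.
    """
    n = len(tour)
    best = None
    for i in range(1, Q + 1):            # the first segment holds i customers
        cuts = [0, min(i, n)]
        while cuts[-1] < n:              # every later segment holds up to Q
            cuts.append(min(cuts[-1] + Q, n))
        routes = [tour[a:b] for a, b in zip(cuts, cuts[1:])]
        cost = sum(route_cost(points, depot, r) for r in routes)
        if best is None or cost < best[0]:
            best = (cost, routes)
    return best
```

Each of the Q partitions is evaluated in O(n) time, so for a given tour the sketch runs in O(nQ) time overall.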

It is easy to see that for a given traveling salesman tour, the running time of 
the ITP heuristic is $O(nQ)$. The performance of this heuristic clearly depends 
on the quality of the initial traveling salesman tour chosen in the first step of 
the algorithm. Hence, when the ITP heuristic partitions an $\alpha$-optimal traveling 
salesman tour, it is denoted ITP($\alpha$). To establish the worst-case behavior of the 
algorithm, we first find a lower bound on $Z^*$ and then calculate an upper bound 
on the cost of the solution produced by the ITP($\alpha$) heuristic. 

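Lemma 16.2.1 $Z^* \ge \max\left\{ L^*(N_0),\ \frac{2}{Q} \sum_{i \in N} d_i \right\}$. 

Proof. Clearly, $Z^* \ge L^*(N_0)$ by the triangle inequality. To prove 
$Z^* \ge \frac{2}{Q} \sum_{i \in N} d_i$, consider an optimal solution in which $N$ is partitioned into 
subsets $\{N_1, \ldots, N_m\}$, where each set $N_j$ is served by a single vehicle. Clearly, 

$$Z^* = \sum_{j=1}^{m} L^*(N_j \cup \{x_0\}) \ge \sum_{j=1}^{m} 2 \max_{i \in N_j} d_i \ge \sum_{j=1}^{m} \frac{2}{|N_j|} \sum_{i \in N_j} d_i \ge \frac{2}{Q} \sum_{i \in N} d_i. \qquad ∎$$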


Lemma 16.2.2 $Z^{ITP(\alpha)} \le \frac{2}{Q} \sum_{i \in N} d_i + \left(1 - \frac{1}{Q}\right) \alpha L^*(N_0)$. 

Proof. We prove the lemma by finding the cumulative length of the $Q$ solutions 
generated by the ITP heuristic. The $i$th solution consists of the segments 

$$\{x^{(1)}, x^{(2)}, \ldots, x^{(i)}\},\ \{x^{(i+1)}, x^{(i+2)}, \ldots, x^{(i+Q)}\},\ \ldots,\ \{x^{(i+1+\lfloor (n-i-1)/Q \rfloor Q)}, \ldots, x^{(n)}\}.$$

Thus, among the $Q$ solutions generated, each customer $x^{(i)}$, $2 \le i \le n - 1$, 
appears exactly once as the first point of a segment and exactly once as the last 
point. Therefore, in the cumulative length of the $Q$ solutions, the term $2 d_{x^{(i)}}$ is 
incurred for each $i$, $2 \le i \le n - 1$. Customer $x^{(1)}$ is the first point of a segment in 
each of the $Q$ solutions, and in the first one it is also the last point. Thus, the term 
$d_{x^{(1)}}$ appears $Q + 1$ times in the cumulative length. Similarly, $x^{(n)}$ is always the 
last point of a segment in each of the $Q$ solutions, and once the first point. Thus, 
the term $d_{x^{(n)}}$ appears $Q + 1$ times in the cumulative length as well. Finally, each 
one of the arcs $(x^{(i)}, x^{(i+1)})$ for $1 \le i \le n - 1$ appears in exactly $Q - 1$ solutions 
since it is excluded from only one solution. These arcs, together with the $Q - 1$ 
arcs connecting the depot to $x^{(1)}$ and $Q - 1$ arcs connecting the depot to $x^{(n)}$, 
form $Q - 1$ copies of the initial traveling salesman tour selected in the first step 
of the heuristic. Thus, if the initial traveling salesman tour is an $\alpha$-optimal tour, 
the cumulative length of all $Q$ tours is 

$$2 \sum_{i \in N} d_i + (Q - 1) L^{\alpha}(N_0) \le 2 \sum_{i \in N} d_i + (Q - 1) \alpha L^*(N_0).$$

Hence, 

$$Z^{ITP(\alpha)} \le \frac{2}{Q} \sum_{i \in N} d_i + \left(1 - \frac{1}{Q}\right) \alpha L^*(N_0). \qquad ∎$$



Combining upper and lower bounds, we obtain the following result. 

Theorem 16.2.3 

$$\frac{Z^{ITP(\alpha)}}{Z^*} \le 1 + \left(1 - \frac{1}{Q}\right) \alpha. \tag{16.1}$$
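
The step from the two lemmas to (16.1) is short and can be spelled out. The chain below uses only Lemma 16.2.1 and Lemma 16.2.2 as stated above:

$$Z^{ITP(\alpha)} \le \frac{2}{Q} \sum_{i \in N} d_i + \left(1 - \frac{1}{Q}\right) \alpha L^*(N_0) \le Z^* + \left(1 - \frac{1}{Q}\right) \alpha Z^*,$$

since Lemma 16.2.1 bounds both $\frac{2}{Q} \sum_{i \in N} d_i$ and $L^*(N_0)$ from above by $Z^*$. Dividing by $Z^*$ gives (16.1).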



For example, if Christofides' polynomial-time heuristic ($\alpha = 1.5$) is used to 
obtain the initial traveling salesman tour, we have 

$$\frac{Z^{ITP(1.5)}}{Z^*} \le \frac{5}{2} - \frac{3}{2Q}.$$



The proof of the worst-case result for the ITP($\alpha$) heuristic suggests that if we 
can improve the bound in (16.1) for $\alpha = 1$, then the bound can be improved for 
any $\alpha > 1$. However, the following theorem, proved by Li and Simchi-Levi (1990), 
says that this is impossible; that is, the bound 

$$\frac{Z^{ITP(1)}}{Z^*} \le 2 - \frac{1}{Q}$$

is sharp. 



Theorem 16.2.4 For any integer $Q \ge 1$, there exists a problem instance with 
$Z^{ITP(1)}/Z^* = 2 - \frac{1}{Q}$. 

Proof. Let us consider the graph $\mathcal{G}(0, Q)$. A solution obtained by the ITP heuristic 
is shown in Fig. 16.3. In this solution, 

$$Z^{ITP(1)} = 2 + 2 + \underbrace{4 + 4 + \cdots + 4}_{Q - 2 \text{ times}} + 2 = 4Q - 2.$$

One can construct a solution that has $Q$ vehicles serve the $Q$ groups of customers 
and the $(Q + 1)$st vehicle serve the other $Q - 1$ nodes. Thus, 

$$Z^* \le 2Q.$$

Hence, 

$$\frac{Z^{ITP(1)}}{Z^*} \ge 2 - \frac{1}{Q}.$$

This, together with the upper bound of (16.1), completes the proof. ∎ 

Another variant of the tour-partitioning heuristic is the optimal partitioning 
(OP) heuristic described by Beasley (1983). The algorithm takes a traveling 
salesman tour and optimally partitions it into a set of feasible routes; that is, each 
route contains at most $Q$ customers. 

Given a traveling salesman tour through the customers and the depot, the heuristic 
numbers the points $x^{(0)}, x^{(1)}, \ldots, x^{(n)}$ in order of appearance on the tour, where 
$x^{(0)}$ is the depot. Define 

$$C_{jk} = \begin{cases} \text{the distance traveled by a vehicle that starts at } x^{(0)}, \text{ visits customers} \\ \quad x^{(j+1)}, x^{(j+2)}, \ldots, x^{(k)}, \text{ and returns to } x^{(0)}, & \text{if } k - j \le Q; \\ \infty, & \text{otherwise.} \end{cases}$$

If we find the shortest path from $x^{(0)}$ to $x^{(n)}$ in the acyclic graph [with nodes $x^{(i)}$, 
$0 \le i \le n$, and arcs $(x^{(i)}, x^{(j)})$ for $0 \le i < j \le n$], where the distance between $x^{(j)}$ 
and $x^{(k)}$ is $C_{jk}$, we will have an optimal partition of the traveling salesman tour 
into feasible routes. For example, if the shortest path from $x^{(0)}$ to $x^{(n)}$ is 
$x^{(0)} \to x^{(t)} \to x^{(u)} \to x^{(n)}$, then three tours are formed, namely, 
$\{x^{(0)}, x^{(1)}, \ldots, x^{(t)}, x^{(0)}\}$, $\{x^{(0)}, x^{(t+1)}, x^{(t+2)}, \ldots, x^{(u)}, x^{(0)}\}$, and 
$\{x^{(0)}, x^{(u+1)}, x^{(u+2)}, \ldots, x^{(n)}, x^{(0)}\}$. 
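
The shortest-path formulation can be sketched as a small dynamic program. The code below is an illustration under stated assumptions, not Beasley's implementation: distances are Euclidean, the tour is given as a list of customer indices, and the names `optimal_partition`, `prefix`, and `C` are hypothetical. Prefix sums of arc lengths along the tour make each cost $C_{jk}$ available in constant time.

```python
import math

def optimal_partition(points, depot, tour, Q):
    """Optimal partition of a fixed tour into routes of at most Q customers."""
    n = len(tour)
    xs = [depot] + [points[i] for i in tour]       # xs[0] plays the role of x^(0)
    # prefix[k] = length of the tour path x^(1) -> ... -> x^(k)
    prefix = [0.0] * (n + 1)
    for k in range(2, n + 1):
        prefix[k] = prefix[k - 1] + math.dist(xs[k - 1], xs[k])

    def C(j, k):
        # cost of the route depot -> x^(j+1) -> ... -> x^(k) -> depot
        return (math.dist(xs[0], xs[j + 1]) + prefix[k] - prefix[j + 1]
                + math.dist(xs[k], xs[0]))

    dist = [0.0] + [math.inf] * n     # dist[k]: shortest path from x^(0) to x^(k)
    pred = [0] * (n + 1)
    for k in range(1, n + 1):
        for j in range(max(0, k - Q), k):          # only arcs with k - j <= Q
            if dist[j] + C(j, k) < dist[k]:
                dist[k], pred[k] = dist[j] + C(j, k), j

    routes, k = [], n
    while k > 0:                      # walk the predecessor links back to x^(0)
        routes.append(tour[pred[k]:k])
        k = pred[k]
    return dist[n], routes[::-1]
```

Every node examines at most Q incoming arcs, so the sketch runs in O(nQ) time for a given tour.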

For a given traveling salesman tour, the above shortest-path problem can be 
solved in $O(nQ)$ time, including the time required to evaluate the costs $C_{jk}$. 
When the OP heuristic partitions an $\alpha$-optimal traveling salesman tour, it is 
denoted OP($\alpha$). The partitions considered by the OP($\alpha$) heuristic include all $Q$ 
of the partitions generated by the ITP($\alpha$) heuristic. Therefore, $Z^{OP(\alpha)} \le Z^{ITP(\alpha)}$, 
and hence, its worst-case bound is at least as good; that is, 

$$\frac{Z^{OP(\alpha)}}{Z^*} \le 1 + \left(1 - \frac{1}{Q}\right) \alpha.$$

The next theorem implies that for $\alpha = 1$, this bound is asymptotically sharp; that 
is, $Z^{OP(1)}/Z^*$ tends to 2 when $Q$ approaches infinity. 



Theorem 16.2.5 For any integer $Q \ge 1$, there exists a problem instance with 
$Z^{OP(1)}/Z^*$ arbitrarily close to $2 - \frac{2}{Q+1}$. 

Proof. Consider the graph $\mathcal{G}(1, KQ + 1)$, where $K$ is a positive integer. It is easy 
to check that 

$$Z^{OP(1)} = 2(KQ + 1) + 2KQ.$$

On the other hand, consider the solution in which $KQ + 1$ vehicles serve the $KQ + 1$ 
groups of customers and another $K$ vehicles serve the other nodes. Hence, 

$$Z^* \le 2(KQ + 1) + 2K,$$

and therefore, 

$$\lim_{K \to \infty} \frac{Z^{OP(1)}}{Z^*} \ge 2 - \frac{2}{Q + 1}. \qquad ∎$$
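
To see where the limit comes from, the ratio can be written out explicitly. The display below only rearranges the two quantities stated in the proof:

$$\frac{Z^{OP(1)}}{Z^*} \ge \frac{2(KQ + 1) + 2KQ}{2(KQ + 1) + 2K} = \frac{2KQ + 1}{KQ + K + 1} \longrightarrow \frac{2Q}{Q + 1} = 2 - \frac{2}{Q + 1} \quad (K \to \infty).$$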



16.3 The Asymptotic Optimal Solution Value 



In this and the following section, we assume that the customers are points in 
the plane and that the distance between any pair of customers is given by the 
Euclidean distance. Assume, without loss of generality, that the depot is the point 
(0,0) and $\|x\|$ designates the distance from the depot to the point $x \in \mathbb{R}^2$. The 
results discussed in this section and the next are mainly based on Haimovich and 
Rinnooy Kan (1985). 

The upper bound of Lemma 16.2.2 has two cost components; the first component 
is proportional to the total “radial” cost between the depot and the customers. 
The second component is proportional to the “circular” cost: the cost of traveling 
between customers. This cost is related to the cost of the optimal traveling sales- 
man tour. As discussed in Chap. 2, for large n, the cost of the optimal traveling 
salesman tour grows like $\sqrt{n}$, while the total radial cost between the depot and 
the customers grows like n. Therefore, it is intuitive that when the number of cus- 
tomers is large enough, the first cost component will dominate the second. This 
observation is now formally proven. 



Theorem 16.3.1 Let $x_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random 
variables having a distribution $\mu$ with compact support in $\mathbb{R}^2$. Let 

$$E(d) = \int_{\mathbb{R}^2} \|x\| \, d\mu(x).$$

Then, with probability 1, 

$$\lim_{n \to \infty} \frac{Z^*}{n} = \frac{2}{Q} E(d).$$

Proof. Lemma 16.2.1 and the strong law of large numbers tell us that 

$$\lim_{n \to \infty} \frac{Z^*}{n} \ge \frac{2}{Q} E(d) \quad \text{(a.s.)}. \tag{16.2}$$

On the other hand, from Lemma 16.2.2, 

$$\frac{Z^*}{n} \le \frac{Z^{ITP(1)}}{n} \le \frac{2}{nQ} \sum_{i \in N} d_i + \left(1 - \frac{1}{Q}\right) \frac{L^*(N_0)}{n}.$$

From Chap. 5, we know that there exists a constant $\beta > 0$, independent of the 
distribution $\mu$, such that with probability 1, 

$$\lim_{n \to \infty} \frac{L^*(N_0)}{\sqrt{n}} = \beta \int_{\mathbb{R}^2} f^{1/2}(x) \, dx,$$

where $f$ is the density of the absolutely continuous part of the distribution $\mu$. 
Hence, 

$$\lim_{n \to \infty} \frac{Z^*}{n} \le \frac{2}{Q} E(d) \quad \text{(a.s.)}.$$

This together with (16.2) proves the theorem. ∎ 
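
The last step of the proof relies on the tour term vanishing after division by $n$. Spelled out, with $\beta$ and $f$ as in the result quoted from Chap. 5:

$$\frac{L^*(N_0)}{n} = \frac{1}{\sqrt{n}} \cdot \frac{L^*(N_0)}{\sqrt{n}} \longrightarrow 0 \quad \text{(a.s.)},$$

because the second factor converges to the finite constant $\beta \int_{\mathbb{R}^2} f^{1/2}(x)\,dx$ while the first tends to zero. The strong law of large numbers gives $\frac{1}{n} \sum_{i \in N} d_i \to E(d)$ almost surely, so the right-hand side of the bound obtained from Lemma 16.2.2 converges to $\frac{2}{Q} E(d)$.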

The following observation is in order. Haimovich and Rinnooy Kan prove The- 
orem 16.3.1 merely assuming E(d) is finite rather than the stronger assumption 
of a compact support. However, the restriction to a compact support seems to be 
satisfactory for all practical purposes. The following is another important 
generalization of Theorem 16.3.1. Assume that a cluster of $w_k$ customers (rather than a 
single customer) is located at point $x_k$, $k = 1, 2, \ldots, n$. The theorem then becomes 

$$\lim_{n \to \infty} \frac{Z^*}{n} = \frac{2}{Q} E(w) E(d), \tag{16.3}$$

where $E(w)$ is the expected cluster size, provided that the cluster size is 
independent of the location. This follows from a straightforward adaptation of 
Lemmas 16.2.1 and 16.2.2. 



16.4 Asymptotically Optimal Heuristics 

The proof of the previous theorem (Theorem 16.3.1) reveals that the ITP($\alpha$) 
heuristic provides a solution whose cost approaches the optimal cost when $n$ tends 
to infinity. Indeed, replacing ITP(1) by ITP($\alpha$) in the previous proof gives the 
following theorem. 

Theorem 16.4.1 Under the conditions of Theorem 16.3.1 and for any fixed $\alpha \ge 1$, 
the ITP($\alpha$) heuristic is asymptotically optimal. 

As is pointed out by Haimovich and Rinnooy Kan (1985), iterated tour- 
partitioning heuristics, although asymptotically optimal, hardly exploit the spe- 
cial topological structure of the Euclidean plane in which the points are located. It 
is therefore natural to consider region-partitioning (RP) heuristics that are more 
geometric in nature. 
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Haimovich and Rinnooy Kan consider three classes of regional partitioning 
schemes. In rectangular region partitioning (RRP), one starts with a rectangle 
containing the set of customers N and cuts it into smaller rectangles. In polar 
region partitioning (PRP) and circular region partitioning (CRP), one starts with 
a circle centered at the depot and partitions it by means of circular arcs and radial 
lines. We shall shortly discuss each one of these in detail. 

In each case, the RP heuristics construct subregions of the plane, where subre- 
gion j contains a set of customers N(j). These subregions are constructed so that 
each one of them has exactly Q customers except possibly one. 

Since every subset $N(j)$ has no more than $Q$ customers, each of these RP 
heuristics allocates one vehicle to each subregion. The vehicles then use the following 
routing strategy. The first customer visited is the one closest to the depot among 
all the customers in $N(j)$. The rest are visited in the order of an $\alpha$-optimal 
traveling salesman tour through $N(j)$. After visiting all the customers in the subregion, 
the vehicle returns to the depot through the first (closest) customer. It is therefore 
natural to call these heuristics RP($\alpha$) heuristics. In particular, we have RRP($\alpha$), 
PRP($\alpha$), and CRP($\alpha$). 

Lemma 16.4.2 $Z^{RP(\alpha)} \le \frac{2}{Q} \sum_{i \in N} d_i + 2 d_{\max} + \alpha \sum_{j} L^*(N(j))$. 

Proof. We number the subsets $N(j)$ constructed by the RP($\alpha$) heuristic so that 
$|N(j)| = Q$ for every $j \ge 2$ and $|N(1)| \le Q$. It follows that the total distance 
traveled by the vehicle that visits subset $N(j)$, for $j \ge 2$, is 

$$\le 2 \min_{i \in N(j)} d_i + \alpha L^*(N(j)) \le \frac{2}{Q} \sum_{i \in N(j)} d_i + \alpha L^*(N(j)),$$

while the total distance traveled by the vehicle that visits $N(1)$ is no more than 

$$2 d_{\max} + \alpha L^*(N(1)).$$

Taking the sum over all subregions, we obtain the desired result. ∎ 

The quality of the upper bound of Lemma 16.4.2 depends, of course, on the 
quantity $\sum_{j} L^*(N(j))$. This value was analyzed in Chap. 5, where it was shown 
that for any RP heuristic, 

$$\sum_{j} L^*(N(j)) \le L^*(N) + \frac{3}{2} P^{RP}, \tag{16.4}$$

where $P^{RP}$ is the sum of perimeters of the subregions generated by the RP 
heuristic. For this reason, we analyze the quantity $P^{RP}$ in each of the three 
region-partitioning heuristics. 
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Rectangular Region Partitioning 

This heuristic is identical to the one introduced for the traveling salesman 
problem in Sect. 5.3. The smallest rectangle with sides $a$ and $b$ containing the set of 
customers $N$ is partitioned with horizontal and vertical lines. First, the region is 
subdivided by $t$ vertical lines such that each subregion contains exactly $(h + 1)Q$ 
points except possibly the last one. Each of these $t + 1$ subregions is then 
partitioned with $h$ horizontal lines into $h + 1$ smaller subregions such that each contains 
exactly $Q$ points except possibly for the last one. 

As before, $h$ and $t$ should satisfy 

$$t = \left\lceil \frac{n}{(h + 1)Q} \right\rceil - 1$$

and 

$$t(h + 1)Q < n \le (t + 1)(h + 1)Q.$$

The unique integer that satisfies these conditions is $h = \left\lceil \sqrt{n/Q} - 1 \right\rceil$. Note that the 
number of vertical lines added is $t \le \sqrt{n/Q}$, and each of these lines is counted twice 
in the quantity $P^{RRP}$. 

In the second step of the RRP, we add $h$ horizontal lines, where $h \le \sqrt{n/Q}$. These 
horizontal lines are also counted twice in $P^{RRP}$. It follows that 

$$P^{RRP} \le 2 \sqrt{\frac{n}{Q}}\,(a + b) + 2(a + b) \le 8 d_{\max} \sqrt{\frac{n}{Q}} + 8 d_{\max}.$$

Polar Region Partitioning 

The circle with radius $d_{\max}$ containing the set $N$ and centered at the depot is 
partitioned in exactly the same way as in the previous partitioning scheme, with 
the exception that circular arcs and radial lines replace vertical and horizontal 
lines. Using the same analysis, one can show 

$$P^{PRP} \le 6\pi d_{\max} \cdots \tag{16.5}$$



Circular Region Partitioning 

This scheme partitions the circle centered at the depot with radius $d_{\max}$ into $h$ 
equal sectors, where $h$ is to be determined. Each sector is then partitioned into 
subregions via circular arcs, such that each subregion contains exactly $Q$ customers 
except possibly the one closest to the depot. Thus, at most $h$ subregions, each from 
one sector, have less than $Q$ customers. These subregions (with the depot on their 
boundary) are then repartitioned with radial cuts such that at most $h - 1$ of them 
have exactly $Q$ customers each except for possibly the last one. 

The total length of the initial radial lines is $h d_{\max}$. The length of an inner 
circular arc bounding a subregion containing a set $N(j)$ is no more than 

$$\frac{2\pi}{h} \min_{i \in N(j)} d_i \le \frac{2\pi}{h} \cdot \frac{\sum_{i \in N(j)} d_i}{|N(j)|} = \frac{2\pi \sum_{i \in N(j)} d_i}{hQ},$$

while the length of the outer circle is $2\pi d_{\max}$. Finally, the repartitioning of the 
central subregions adds no more than $h d_{\max}/2$. Thus, 

$$P^{CRP} \le 2 \left( h d_{\max} + \frac{2\pi \sum_{i \in N} d_i}{hQ} + \frac{h d_{\max}}{2} \right) + 2\pi d_{\max}.$$

Taking 

$$h = \left\lceil \sqrt{\frac{4\pi \sum_{i \in N} d_i}{3 Q d_{\max}}} \right\rceil,$$

we obtain the following upper bound on $P^{CRP}$: 

$$P^{CRP} \le 4 \sqrt{3\pi d_{\max} \frac{\sum_{i \in N} d_i}{Q}} + (3 + 2\pi) d_{\max}.$$
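
The constants in the final bound can be checked by a short calculation. The sketch below assumes that the perimeter bound has the form $P^{CRP} \le 3 h d_{\max} + \frac{4\pi S}{hQ} + 2\pi d_{\max}$, where $S = \sum_{i \in N} d_i$ is shorthand introduced here, and that $h = \lceil h_0 \rceil$ with $h_0 = \sqrt{4\pi S / (3 Q d_{\max})}$, the value that balances the first two terms. Since $h_0 \le h \le h_0 + 1$,

$$3 h d_{\max} \le 3 d_{\max} h_0 + 3 d_{\max} = 2 \sqrt{\frac{3\pi d_{\max} S}{Q}} + 3 d_{\max}, \qquad \frac{4\pi S}{hQ} \le \frac{4\pi S}{h_0 Q} = 2 \sqrt{\frac{3\pi d_{\max} S}{Q}},$$

and adding the outer circle $2\pi d_{\max}$ gives

$$P^{CRP} \le 4 \sqrt{\frac{3\pi d_{\max} S}{Q}} + (3 + 2\pi) d_{\max}.$$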



The reader should be aware that all of these partitioning schemes can be 
implemented in $O(n \log n)$ time. We now have all the necessary ingredients for an 
asymptotic analysis of the performance of these partitioning heuristics. 

Theorem 16.4.3 Under the conditions of Theorem 16.3.1 and for any fixed 
$\alpha \ge 1$, RRP($\alpha$), PRP($\alpha$), and CRP($\alpha$) are asymptotically optimal. 



Proof. Lemma 16.4.2, together with (16.4), provides the following upper bound on 
the total distance traveled by all vehicles in the solution produced by the above 
RP heuristics: 

$$Z^{RP(\alpha)} \le \frac{2}{Q} \sum_{i \in N} d_i + 2 d_{\max} + \alpha L^*(N) + \frac{3}{2} \alpha P^{RP}.$$

By the strong law of large numbers and the fact that the distribution has compact 
support, $\frac{1}{n} \sum_{i \in N} d_i$ converges almost surely to $E(d)$, while $d_{\max}/n$ converges almost 
surely to 0. Furthermore, $L^*(N)/n$ converges to 0 almost surely; see the proof of 
Theorem 16.3.1. Finally, from the analysis of each of the region-partitioning heuristics 
and the fact that the points are in a compact region, $P^{RP}/n$ converges almost surely 
to zero as well. ∎ 

In conclusion, we see that the CVRP with equal demands is asymptotically solv- 
able via several different region-partitioning schemes. In fact, since each customer 
has the same demand, the packing of the customers’ demands into the vehicles 
is a trivial problem. Any Q customers can fit. The more difficult problem, when 
demands are of different sizes, presents complicating bin-packing features, which 
will prove to be more difficult. 
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16.5 Exercises 



Exercise 16.1. Consider the following version of the capacitated vehicle routing 
problem (CVRP). You are given a network $G = (V, A)$ with positive arc lengths. 
Assume that $E \subseteq A$ is a given set of edges that have to be “covered” by vehicles. 
The vehicles are initially located at a depot $p \in V$. Each vehicle has a “capacity” 
$q$; that is, each vehicle can cover no more than $q$ edges from $E$. Once a vehicle 
starts an edge in $E$, it has to cover all of it. The objective is to design tours for 
vehicles so that all edges in $E$ are covered, vehicles’ capacities are not violated, 
and the total distance traveled is as small as possible. 

(a) Suppose we want first to find a single tour that starts at the depot p, traverses 
all edges in E, and ends at p and whose total cost (length) is as small as 
possible. Generalize Christofides’ heuristic for this case. 

(b) Consider now the version of the CVRP described above and suggest two 
possible lower bounds on the optimal cost of the CVRP. 

(c) Describe a heuristic algorithm based on a tour-partitioning approach using, 
as the initial tour, the tour you found in part (a). What is the worst-case 
bound of your algorithm? 

Exercise 16.2. Derive (16.3). 

Exercise 16.3. Consider an n-customer instance of the CVRP with equal de- 
mands. Assume there are m depots and at each depot is an unlimited number of 
vehicles of limited capacity. Suggest an asymptotically optimal region-partitioning 
scheme for this case. 

Exercise 16.4. Consider an n-customer instance of the CVRP with equal de- 
mands. There are K customer types: A customer is of type k with independent 
probability $p_k > 0$. Customers of different types cannot be served together in the 
same vehicle. Devise an asymptotically optimal heuristic for this problem. If K is 
a function of n, what conditions on K(n) are necessary to ensure that this same 
heuristic is asymptotically optimal? 



Exercise 16.5. Derive (16.5). 



17 

The Capacitated VRP with Unequal 
Demands 



17.1 Introduction 

In this chapter, we consider the capacitated vehicle routing problem with unequal 
demands (UCVRP). In this version of the problem, each customer $i$ has a demand 
$w_i$ and the capacity constraint stipulates that the total amount delivered by a 
single vehicle cannot exceed Q. We let Z* denote the optimal solution value of 
UCVRP, that is, the minimal total distance traveled by all vehicles. 

In this version of the problem, the demand of a customer cannot be split over sev- 
eral vehicles; that is, each customer must be served by a single vehicle. This, more 
general, version of the model is sometimes called the CVRP with unsplit demands. 
The version where demands may be split is dealt with in Chap. 16. Splitting a 
customer’s demand is often physically impossible or managerially undesirable due 
to customer service or accounting considerations. 



17.2 Heuristics for the CVRP 

A great deal of work has been devoted to the development of heuristics for the 
UCVRP; see, for example, Christofides (1985), Fisher (1995), and Federgruen and 
Simchi-Levi (1995), or Bertsimas and Simchi-Levi (1996). Following Christofides, 
we classify these heuristics into the four categories: 

• constructive methods; 

• route-first-cluster-second methods; 
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• cluster-first-route-second methods; 

• incomplete optimization methods. 

We will describe the main characteristics of each of these classes and give 
examples of heuristics that fall into each. 

Constructive Methods 

The savings algorithm suggested by Clarke and Wright (1964) is the most 
important member of this class. This heuristic, which is the basis for a number of 
commercial vehicle routing packages, is one of the earliest heuristics designed for 
this problem and, without a doubt, the most widely known. The idea of the savings 
algorithm is very simple: Consider the depot and $n$ demand points. Suppose that 
initially we assign a separate vehicle to each demand point. The total distance 
traveled by a vehicle that visits demand point $i$ is $2 d_i$, where $d_i$ is the distance 
from the depot to demand point $i$. Therefore, the total distance traveled in this 
solution is $2 \sum_{i=1}^{n} d_i$. 

If we now combine two routes, say we serve $i$ and $j$ on a single trip (with the 
same vehicle), the total distance traveled by this vehicle is $d_i + d_{ij} + d_j$, where $d_{ij}$ 
is the distance between demand points $i$ and $j$. Thus, the savings obtained from 
combining demand points $i$ and $j$, denoted $s_{ij}$, is 

$$s_{ij} = 2 d_i + 2 d_j - (d_i + d_j + d_{ij}) = d_i + d_j - d_{ij}.$$

The larger the savings Sij, the more desirable it is to combine demand points i 
and j. Based on this idea, Clarke and Wright suggest the following algorithm. 

The Savings Algorithm 

Step 1: Start with the solution that has each customer visited by a separate 
vehicle. 

Step 2: Calculate the savings $s_{ij} = d_{0i} + d_{j0} - d_{ij} \ge 0$ for all pairs of customers $i$ 
and $j$. 

Step 3: Sort the savings in nonincreasing order. 

Step 4: Find the first feasible arc $(i, j)$ in the savings list, where 

(1) $i$ and $j$ are on different routes, 

(2) both $i$ and $j$ are either the first or last visited on their 
respective routes, and 

(3) the sum of demands of routes $i$ and $j$ is no more than $Q$. 

Add arc $(i, j)$ to the current solution and delete arcs $(0, i)$ and $(j, 0)$. Delete 
arc $(i, j)$ from the savings list. 

Step 5: Repeat step 4 until no more arcs satisfy the conditions. 
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Additional constraints, which might be present, can easily be incorporated into 
step 4. Usually, a simple check can be performed to see whether combining the 
tours containing i and j violates any of these constraints. 
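
The five steps can be sketched as follows. This is an illustrative version under stated assumptions, not the original Clarke and Wright code: distances are Euclidean and symmetric (so a route may be reversed freely), every demand is at most Q, and the names `savings_routes`, `route_of`, and `load` are hypothetical. A single pass over the sorted list suffices here because an arc that is infeasible at some point can never become feasible later: loads only grow and interior customers never become endpoints again.

```python
import math

def savings_routes(points, depot, demand, Q):
    """Savings heuristic sketch: returns a list of routes (customer indices)."""
    n = len(points)
    d0 = [math.dist(depot, p) for p in points]
    routes = {i: [i] for i in range(n)}     # Step 1: one vehicle per customer
    route_of = list(range(n))               # customer -> id of its route
    load = {i: demand[i] for i in range(n)}
    # Steps 2 and 3: savings s_ij = d_i + d_j - d_ij, in nonincreasing order
    pairs = sorted(
        ((d0[i] + d0[j] - math.dist(points[i], points[j]), i, j)
         for i in range(n) for j in range(i + 1, n)),
        reverse=True)
    for _, i, j in pairs:                   # Steps 4 and 5
        ri, rj = route_of[i], route_of[j]
        if ri == rj or load[ri] + load[rj] > Q:
            continue                        # conditions (1) and (3)
        a, b = routes[ri], routes[rj]
        if i not in (a[0], a[-1]) or j not in (b[0], b[-1]):
            continue                        # condition (2): endpoints only
        if a[-1] != i:
            a.reverse()                     # make i the last stop of a
        if b[0] != j:
            b.reverse()                     # make j the first stop of b
        a.extend(b)                         # add arc (i, j)
        load[ri] += load.pop(rj)
        for c in routes.pop(rj):
            route_of[c] = ri
    return list(routes.values())
```

Additional side constraints of the kind mentioned above would be checked at the same place as conditions (1) to (3).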

Other examples of heuristics that fall into this class are the heuristics of Gaskell 
(1967), Yellow (1970), and Russell (1977). In particular, the first two are 
modifications of the savings algorithm. 

Route-First-Cluster-Second Methods 

Traditionally, this class has been defined as follows. The class consists of those 
heuristics that first construct a traveling salesman tour through all the customers 
(route first) and then partition the tour into segments (cluster second). One vehicle 
is assigned to each segment and visits the customers according to their appearance 
on the traveling salesman tour. 

As we shall see in the next section, some strong statements can be made about 
the performance of this class’s heuristics. For this purpose, we give a more precise 
definition of the class here. 

Definition 17.2.1 A heuristic is a route-first-cluster-second heuristic if it first 
orders the customers according to their locations, disregarding demand sizes, and 
then partitions this ordering to produce feasible clusters. These clusters consist of 
sets of customers that are consecutive in the initial order. Customers are then 
routed within their cluster depending on the specific heuristic. 

This definition of the class is more general than the traditional definition given 
above. The disadvantage of this class, of which we will give a rigorous analysis, 
can be highlighted by the following simple example. Consider a routing strategy 
that orders the demands in such a way that the sequence of demand sizes in the 
order is (9, 2, 9, 2, 9, 2, 9, 2, . . .). If the vehicle capacity is 10, then any partition of 
this tour must assign one vehicle to each customer. This solution would consist of 
half of the vehicles going to pick up two units (using 20% of the vehicle capacity) 
and returning to the depot, not a very efficient strategy. By contrast, a routing 
strategy that looks at the demands at the same time as it looks at customer 
locations would clearly find a more intelligent ordering of the customers: one that 
sequences demands efficiently to decrease total distance traveled. 
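
The example can be checked with a few lines of Python. The helper below is a hypothetical illustration (the name `consecutive_clusters` is not from the text): it cuts a fixed ordering greedily into consecutive clusters whose total demand does not exceed Q, assuming every single demand fits in a vehicle. For a fixed ordering, this greedy cut uses the fewest possible consecutive clusters.

```python
def consecutive_clusters(demands, Q):
    """Greedily cut a fixed ordering into consecutive clusters of load <= Q."""
    clusters, current, load = [], [], 0
    for w in demands:
        if current and load + w > Q:    # next customer does not fit: close cluster
            clusters.append(current)
            current, load = [], 0
        current.append(w)
        load += w
    if current:
        clusters.append(current)
    return clusters

fixed_order = [9, 2] * 5                # the ordering (9, 2, 9, 2, ...) with 10 customers
one_per_customer = consecutive_clusters(fixed_order, 10)
regrouped = consecutive_clusters(sorted(fixed_order), 10)
```

With capacity 10, the alternating ordering needs ten vehicles for ten customers, while an ordering that places the five demands of size 2 next to each other needs six. The sorted ordering ignores customer locations and serves only to show how much the choice of ordering matters.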

The route-first-cluster-second class includes classical heuristics such as the opti- 
mal partitioning heuristic introduced by Beasley (1983), and the sweep algorithm 
suggested by Gillett and Miller (1974). 

In the optimal partitioning heuristic, one tries to find an optimal traveling sales- 
man tour, or, if this is not possible, a tour that is close to optimal. This provides 
the initial ordering of the demand points. The ordering is then partitioned in an 
efficient way into segments. This step can be done by formulating a shortest-path 
problem. See Sect. 16.2 for details. 

In the sweep algorithm, an arbitrary demand point is selected as the starting 
point. The other customers are ordered according to the angle made among them, 
the depot, and the starting point. Demands are then assigned to vehicles following 
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this initial order. In effect, the points are “swept” in a clockwise direction around 
the depot and assigned to vehicles. Then efficient routes are designed for each 
vehicle. Specifically, the sweep algorithm is the following. 

The Sweep Algorithm 

Step 1: Calculate the polar coordinates of all customers, where the center is the 
depot and an arbitrary customer is chosen to be at angle 0. Reorder the 
customers so that 

$$0 = \theta_1 \le \theta_2 \le \cdots \le \theta_n.$$

Step 2: Starting from the unrouted customer $i$ with smallest angle $\theta_i$, construct 
a new cluster by sweeping consecutive customers $i + 1, i + 2, \ldots$ until the 
capacity constraint will not allow the next customer to be added. 

Step 3: Continue step 2 until all customers are included in a cluster. 

Step 4: For each cluster constructed, solve the TSP on the subset of customers 
and the depot. 
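
Steps 1 to 3 can be sketched as follows. The code is an illustration under stated assumptions rather than the Gillett and Miller implementation: customers are planar points, every demand is at most Q, and angles are measured counterclockwise from the starting customer (the direction of the sweep does not affect the idea). The name `sweep_clusters` and the parameter `start` are hypothetical. Step 4, solving a TSP within each cluster, is left out.

```python
import math

def sweep_clusters(points, depot, demand, Q, start=0):
    """Cluster customers by polar angle around the depot (Steps 1 to 3)."""
    def angle(p):
        return math.atan2(p[1] - depot[1], p[0] - depot[0])

    base = angle(points[start])
    # Step 1: angle of each customer relative to the start, mapped to [0, 2*pi)
    order = sorted(range(len(points)),
                   key=lambda i: (angle(points[i]) - base) % (2 * math.pi))
    # Steps 2 and 3: sweep in angular order, opening a new cluster when full
    clusters, current, load = [], [], 0
    for i in order:
        if current and load + demand[i] > Q:
            clusters.append(current)
            current, load = [], 0
        current.append(i)
        load += demand[i]
    if current:
        clusters.append(current)
    return clusters
```

Because the clusters are consecutive runs of a fixed angular ordering, the sketch makes visible why the sweep algorithm fits Definition 17.2.1.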

In both of these methods, additional constraints can easily be incorporated into 
the algorithm. 

We note that, traditionally, researchers have classified the sweep algorithm as a 
cluster-first-route-second method and not as a route-first-cluster-second method. 
Our opinion is that the essential part of any vehicle routing algorithm is the clus- 
tering phase of the algorithm, that is, how the customers are clustered into groups 
that can be served by individual vehicles. The specific sequencing within a cluster 
can and, for most problems, should be done once these clusters are determined. 
Therefore, a classification of algorithms for the CVRP should be solely based on 
how the clustering is performed. Thus, the sweep algorithm can be viewed as an 
algorithm of the route-first-cluster-second class since the clustering is performed 
on a fixed ordering of the nodes. 

Cluster-First-Route-Second Methods

In this class of heuristics, the clustering is the most important phase. Customers 
are first clustered into feasible groups to be served by the same vehicle (cluster 
first) without regard to any preset ordering, and then efficient routes are designed 
for each cluster (route second). 

Heuristics of this class are usually more technically sophisticated than the pre- 
vious class, since determining the clusters is often based on a mathematical pro- 
gramming approach. This class includes the following three heuristics: 

• the two-phase method (Christofides et al. 1978); 

• the generalized assignment heuristic (Fisher and Jaikumar 1981); 

• the location-based heuristic (Bramel and Simchi-Levi 1995).

The first two heuristics use, in a first step, the concept of seed customers.
The seed customers are customers that will be in separate vehicles in the solution
and around which tours are constructed. In both cases, the performance of the
algorithm depends highly on the choice of these seeds. Placing the CVRP in the
framework of a different combinatorial problem, the location-based heuristic
selects the seeds in an optimal way and creates, at the same time, tours around
these seeds. Thus, instead of decomposing the process into two steps, as is done
in the two-phase method and the generalized assignment heuristic, the
location-based heuristic simultaneously picks the seeds and designs tours around
them. We will discuss this heuristic in detail in Sect. 17.7.

Incomplete Optimization Methods 

These methods are optimization algorithms that, due to the prohibitive com- 
puting time involved in reaching an optimal solution, are terminated prematurely. 
Examples of these include 

• cutting-plane methods (Cornuéjols and Harche 1993),

• minimum K-tree methods (Fisher 1994). 

The disadvantage of incomplete optimization methods is that they still require
large amounts of processing time, and they can usually handle problems with no
more than 100 customers.



17.3 Worst-Case Analysis of Heuristics 

In the worst-case analysis presented here, we assume that the customer demands
$w_1, w_2, \ldots, w_n$ and the vehicle capacity $Q$ are rationals. Hence, without loss of
generality, we assume that $Q$ and the $w_i$ are integers. Furthermore, we may assume
that $Q$ is even; otherwise, one can double $Q$ as well as each $w_i$, $i = 1, 2, \ldots, n$,
without affecting the problem. The following two-phase route-first-cluster-second
heuristic was suggested by Altinkemer and Gavish (1987). In the first phase, we
relax the requirement that the demand of a customer cannot be split. Each
customer $i$ is replaced by $w_i$ unit-demand points that are zero distance apart. We
then apply the ITP($\alpha$) heuristic (see Sect. 16.3) using a vehicle capacity of $Q/2$.
In the second phase, we convert the solution obtained in Phase I to a feasible
solution to the original problem without increasing the total cost. This heuristic
is called the unequal-weight iterated tour-partitioning [UITP($\alpha$)] heuristic.

We now describe the second-phase procedure. Our notation follows the one
suggested by Haimovich et al. (1988). Let $m = \sum_{i \in N} w_i$ be the number of demand
points in the expanded problem. Recall that in the first phase an arbitrary
orientation of the tour is chosen. The customers are then numbered
$x^{(0)}, x^{(1)}, x^{(2)}, \ldots, x^{(n)}$ in order of their appearance on the tour, where $x^{(0)}$ is
the depot. The ITP($\alpha$) heuristic partitions the path from $x^{(1)}$ to $x^{(n)}$ into
$\lceil 2m/Q \rceil$ (or $\lceil 2m/Q \rceil + 1$) disjoint segments such that each one contains no more
than $Q/2$ demand points and connects the endpoints of each segment to the depot.
The segments are indexed by $j = 1, 2, \ldots, \lceil 2m/Q \rceil$, such that the first customer of
the $j$th segment is $x^{(b_j)}$ and the last customer is $x^{(e_j)}$. Hence, the $j$th segment,
denoted by $S_j$, includes customers $\{x^{(b_j)}, \ldots, x^{(e_j)}\}$. Obviously, if
$x^{(e_j)} = x^{(b_{j+1})}$ for some $j$, then the demand of customer $x^{(e_j)}$ is split between the
$j$th and $(j+1)$th segments; therefore, these are not feasible routes. On the other
hand, if $x^{(e_j)} \ne x^{(b_{j+1})}$ for all $j$, then the set of routes is feasible.

We now transform the solution obtained in the first phase into a feasible solution 
without increasing the total distance traveled. We use the following procedure. 

The Phase 2 Procedure 

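Step 1: Set $S'_j = \emptyset$, for $j = 1, 2, \ldots, \lceil 2m/Q \rceil$.

Step 2: For $j = 1$ to $\lceil 2m/Q \rceil - 1$, do
  If $x^{(e_j)} = x^{(b_{j+1})}$, then
    If $\sum_{i=b_j}^{e_j} w_{x^{(i)}} \le Q$, then let $S'_j = \{x^{(b_j)}, \ldots, x^{(e_j)}\}$ and
    let $x^{(b_{j+1})} = x^{(b_{j+1}+1)}$;
    else let $S'_j = \{x^{(b_j)}, \ldots, x^{(e_j - 1)}\}$ and $x^{(b_{j+1})} = x^{(e_j)}$;
  else, let $S'_j = \{x^{(b_j)}, \ldots, x^{(e_j)}\}$.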

We argue that the procedure generates feasible sets $S'_j$ for $j = 1, 2, \ldots, \lceil 2m/Q \rceil$.
Note that the $j$th set can be enlarged only in the $(j-1)$st and $j$th iterations
(if at all). Moreover, if it is enlarged in the $j$th iteration, it is clearly done
feasibly in view of the test $\sum_{i=b_j}^{e_j} w_{x^{(i)}} \le Q$. On the other hand, if $S'_j$ is
enlarged in the $(j-1)$st iteration, at most $Q/2$ demand points are added, thus
ensuring feasibility. This can be verified as follows. Assume to the contrary that
in the $(j-1)$st iteration more than $Q/2$ demand points are transferred from $S'_{j-1}$
to $S'_j$, so that in the $(j-1)$st iteration $x^{(e_{j-1})} = x^{(b_j)}$. Since the original set
$S_{j-1}$ contains at most $Q/2$ demand points, we must have shifted demand points in
the $(j-2)$nd iteration from $S_{j-2}$ to $S_{j-1}$ [and, in particular,
$x^{(b_{j-1})} = x^{(e_{j-2})}$], part of which are now being transferred to $S_j$. This implies
that $x^{(*)} = x^{(e_{j-2})} = x^{(b_{j-1})} = x^{(e_{j-1})} = x^{(b_j)}$, where $e_{j-2}$, $b_{j-1}$, $e_{j-1}$, and
$b_j$ refer to the original sets $S_{j-2}$, $S_{j-1}$, and $S_j$. In other words, at the
beginning of the $(j-1)$st iteration, the set $S'_{j-1}$ contains a single customer
$x^{(*)}$. But then, shifting $x^{(b_j)} = x^{(*)}$ backward to $S_{j-1}$ is feasible, contradicting
the fact that more than $Q/2$ demand points need to be shifted forward from $S'_{j-1}$
to $S'_j$. Therefore, the procedure generates feasible sets and we have the
following worst-case bound.

Theorem 17.3.1 $Z^{\mathrm{UITP}(\alpha)} / Z_u^* \le 2 + \left(1 - \frac{2}{Q}\right)\alpha$.



Proof. Recall that in the first phase, the vehicle capacity is set to $Q/2$. Hence,
using the bound of Lemma 16.2.2, we obtain the following upper bound on the
length of the tours generated in Phase I of the UITP($\alpha$) heuristic:

$$\frac{4}{Q} \sum_{i \in N} d_i w_i + \left(1 - \frac{2}{Q}\right) \alpha L^*(N_0). \qquad (17.1)$$



In the second phase of the algorithm, the tour obtained in the first phase is 
converted into a feasible solution with total length no more than (17.1). To verify 
this, we need only to analyze those segments whose endpoints are modified by the 
procedure. 

Suppose that $S_j$ and $S'_j$ differ in their starting point; then $S'_j$ must start with
$x^{(b_j+1)}$. This implies that arc $(x^{(b_j)}, x^{(b_j+1)})$, which is part of the Phase I
solution, does not appear in the $j$th route. The triangle inequality ensures that
the sum of the lengths of arcs $(x^{(0)}, x^{(b_j)})$ and $(x^{(b_j)}, x^{(b_j+1)})$ is no smaller than
the length of arc $(x^{(0)}, x^{(b_j+1)})$. A similar argument can be applied if $S_j$ and $S'_j$
differ in their terminating point. Consequently, for every segment $j$,
$j = 1, 2, \ldots, \lceil 2m/Q \rceil$, the length of the $j$th route according to the new partition
is no longer than the length of the $j$th route according to the old partition.
Hence,

$$Z^{\mathrm{UITP}(\alpha)} \le \frac{4}{Q} \sum_{i \in N} d_i w_i + \left(1 - \frac{2}{Q}\right) \alpha L^*(N_0).$$

Clearly, $Z_u^* \ge Z^*$, and therefore using the lower bound on $Z^*$ developed in
Lemma 16.2.1 completes the proof. ∎

The UITP heuristic was divided into two phases to prove the above worst-case
result. However, if the optimal partitioning heuristic is used in the
unequal-weight model, the actual implementation is a one-step process. This is
done as follows. Given a traveling salesman tour through the set of customers and
the depot, we number the nodes $x^{(0)}, x^{(1)}, \ldots, x^{(n)}$ in order of their appearance
on the tour, where $x^{(0)}$ is the depot. We then define a distance matrix with cost
$C_{jk}$, where

$$C_{jk} = \begin{cases}
\text{the distance traveled by a vehicle that starts at } x^{(0)}, \\
\text{visits customers } x^{(j+1)}, x^{(j+2)}, \ldots, x^{(k)}, \\
\text{and returns to } x^{(0)}, & \text{if } \sum_{i=j+1}^{k} w_{x^{(i)}} \le Q; \\
\infty, & \text{otherwise.}
\end{cases}$$

As in the equal-demand case (see Sect. 16.2), it follows that a shortest path from
$x^{(0)}$ to $x^{(n)}$ in the directed graph with distance cost $C_{jk}$ corresponds to an
optimal partition of the traveling salesman tour. This version of the heuristic,
developed by Beasley and called the unequal-weight optimal partitioning (UOP)
heuristic, also has $Z^{\mathrm{UOP}(\alpha)}/Z_u^* \le 2 + \left(1 - \frac{2}{Q}\right)\alpha$. The following theorem,
proved by Li and Simchi-Levi (1990), implies that when $\alpha = 1$, this bound is
asymptotically tight as $Q$ approaches infinity.
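
The shortest-path view of the optimal partitioning step can be sketched in Python
as follows. This is an illustrative sketch under stated assumptions, not Beasley's
original code: the function name `optimal_partition`, Euclidean distances, and the
convention that index 0 is the depot are choices made here. Because arcs only go
from a lower to a higher tour position, the shortest path is found with a single
forward pass.

```python
import math


def optimal_partition(points, demands, capacity):
    """Split a fixed tour into capacity-feasible routes of minimum total length.

    points[0] is the depot; points[1..n] are the customers in tour order.
    demands[i] is the demand of points[i] (demands[0] is ignored).
    Returns (total_length, routes); each route lists tour positions.
    """
    n = len(points) - 1

    def dist(a, b):
        return math.dist(points[a], points[b])

    best = [math.inf] * (n + 1)  # best[k]: cheapest way to serve positions 1..k
    pred = [None] * (n + 1)
    best[0] = 0.0
    for j in range(n):
        if best[j] == math.inf:
            continue
        load, path = 0, 0.0
        for k in range(j + 1, n + 1):
            load += demands[k]
            if load > capacity:
                break  # the arc cost is infinite from here on
            if k > j + 1:
                path += dist(k - 1, k)
            cost = dist(0, j + 1) + path + dist(k, 0)  # cost of arc (j, k)
            if best[j] + cost < best[k]:
                best[k], pred[k] = best[j] + cost, j
    if best[n] == math.inf:
        raise ValueError("some demand exceeds the vehicle capacity")

    # Walk the predecessor chain backward to recover the routes.
    routes, k = [], n
    while k > 0:
        j = pred[k]
        routes.append(list(range(j + 1, k + 1)))
        k = j
    return best[n], routes[::-1]
```

For three collinear unit-demand customers at distances 1, 2, and 3 from the depot
and capacity 2, the sketch serves the nearest customer alone and pairs the two
farthest ones.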

Theorem 17.3.2 For any integer $Q \ge 1$, there exists a problem instance with
$Z^{\mathrm{UOP}(1)}/Z_u^*$ (and therefore $Z^{\mathrm{UITP}(1)}/Z_u^*$) arbitrarily close to $3 - \frac{6}{Q+2}$.

Proof. We modify the graph $G(2, KQ+1)$, where $K$ is a positive integer, as follows.
Now, every group, instead of containing $Q$ customers, contains only one customer
with demand $Q$. The other $KQ$ customers have unit demand. The optimal traveling
salesman tour is again as shown in Fig. 16.2, and the solution obtained by the
UOP(1) heuristic is to have $2KQ + 1$ vehicles, each one of them serving only one
customer. Thus,

$$Z^{\mathrm{UOP}(1)} = 2(KQ + 1) + 4KQ.$$

The optimal solution to this problem has $KQ + 1$ vehicles serve those customers
with demand $Q$, and $K$ other vehicles serve the unit demand customers. Hence,

$$Z_u^* = 2(KQ + 1) + 4K.$$

Therefore,

$$\lim_{K \to \infty} \frac{Z^{\mathrm{UOP}(1)}}{Z_u^*}
  = \lim_{K \to \infty} \frac{2(KQ + 1) + 4KQ}{2(KQ + 1) + 4K}
  = 3 - \frac{6}{Q + 2}.$$
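
The limit can be checked numerically from the two cost expressions in the proof.
The short Python sketch below uses exact rational arithmetic; the function names
`uop_ratio` and `limit_ratio` are introduced here for illustration only.

```python
from fractions import Fraction


def uop_ratio(K, Q):
    """Ratio of the UOP(1) cost to the optimal cost on the instance of the proof."""
    z_uop = 2 * (K * Q + 1) + 4 * K * Q  # every customer gets its own vehicle
    z_opt = 2 * (K * Q + 1) + 4 * K      # unit-demand customers share K vehicles
    return Fraction(z_uop, z_opt)


def limit_ratio(Q):
    """Value the ratio approaches as K grows: 3 - 6/(Q + 2)."""
    return 3 - Fraction(6, Q + 2)
```

For $Q = 1$ the two costs coincide for every $K$, and as $Q$ grows the limiting
ratio approaches 3.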



17.4 The Asymptotic Optimal Solution Value 

In the probabilistic analysis of the UCVRP, we assume, without loss of generality, 
that the vehicle’s capacity Q equals 1 , and the demand of each customer is no 
more than 1. Thus, vehicles and demands in a capacitated vehicle routing problem 
correspond to bins and item sizes (respectively) in a bin-packing problem. Hence, 
for every routing instance, there is a unique corresponding bin-packing instance. 

Assume the demands $w_1, w_2, \ldots, w_n$ are drawn independently from a distribution
$\Phi$ defined on $[0, 1]$. Assume customer locations are drawn independently from
a probability measure $\mu$ with compact support in $\mathbb{R}^2$. We assume that $d_i > 0$ for
each $i \in N$ since customers at the depot can be served at no cost. In this section,
we find the asymptotic optimal solution value for any $\Phi$ and any $\mu$. This is done
by showing that an asymptotically optimal algorithm for the bin-packing problem,
with item sizes distributed like $\Phi$, can be used to solve, in an asymptotic sense,
the UCVRP.

Given the demands $w_1, w_2, \ldots, w_n$, let $b_n^*$ be the number of bins used in the
optimal solution to the corresponding bin-packing problem. As demonstrated in
Theorem 5.2.4, there exists a constant $\gamma > 0$ (depending only on $\Phi$) such that

$$\lim_{n \to \infty} \frac{b_n^*}{n} = \gamma \quad \text{(a.s.)}. \qquad (17.2)$$

We shall refer to the constant $\gamma$ as the bin-packing constant and omit the
dependence of $\gamma$ on $\Phi$ in the notation.

The following theorem was proved by Simchi-Levi and Bramel (1990). Recall that,
without loss of generality, the depot is positioned at $(0, 0)$ and $\|x\|$ represents
the distance from the point $x \in \mathbb{R}^2$ to the depot.

Theorem 17.4.1 Let $x_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random
variables having a distribution $\mu$ with compact support in $\mathbb{R}^2$. Let

$$E(d) = \int_{\mathbb{R}^2} \|x\| \, d\mu(x).$$

Let the demands $w_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random variables
having a distribution $\Phi$ with support on $[0, 1]$, and assume that the demands and
the locations of the customers are independent of each other. Let $\gamma$ be the
bin-packing constant associated with the distribution $\Phi$; then, almost surely,

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* = 2 \gamma E(d).$$

Thus, the theorem fully characterizes the asymptotic optimal solution value of
the UCVRP, for any reasonable distributions $\Phi$ and $\mu$. An interesting observation
concerns the case where the distribution of the demands allows perfect packing,
that is, when the wasted space in the bins tends to become a small fraction of the
number of bins used. Formally, $\Phi$ is said to allow perfect packing if almost surely
$\lim_{n \to \infty} b_n^*/n = E(w)$. Karmarkar (1982) proved that a nonincreasing probability
density function (with some mild regularity conditions) allows perfect packing.
Rhee (1988) completely characterizes the class of distribution functions $\Phi$ that
allow perfect packing. Clearly, in this case, $\gamma = E(w)$. Thus, Theorem 17.4.1
indicates that allowing the demands to be split or not does not change the
asymptotic objective function value. That is, the UCVRP and the ECVRP can be said
to be asymptotically equivalent when $\Phi$ allows perfect packing.

To prove Theorem 17.4.1, we start by presenting in Sect. 17.4.1 a lower bound
on the optimal objective function value. In Sect. 17.4.2, we present a heuristic for
the UCVRP based on a simple region-partitioning scheme. We show that the cost
of the solution produced by the heuristic converges to our lower bound for any $\Phi$
and $\mu$, thus proving the main theorem of the section.



17.4.1 A Lower Bound



We introduce a lower bound on the optimal objective function value $Z_u^*$. Let
$A \subset \mathbb{R}^2$ be the compact support of $\mu$ and define $d_{\max} = \sup_{x \in A} \{\|x\|\}$. For a
given fixed positive integer $r \ge 1$, partition the circle with radius $d_{\max}$ centered
at the depot into $r$ rings of equal width. Let $d_j = (j-1)\frac{d_{\max}}{r}$ for
$j = 1, 2, \ldots, r, r+1$, and construct the following $2r$ sets of customers:

$$S_j = \{x_k \in N \mid d_j < d_k \le d_{j+1}\} \quad \text{for } j = 1, \ldots, r,$$

and

$$F_j = \bigcup_{i=j}^{r} S_i \quad \text{for } j = 1, 2, \ldots, r.$$

Note that $F_r \subseteq F_{r-1} \subseteq \cdots \subseteq F_1 = N$ since $d_k > 0$ for all $x_k \in N$.

In the lemma below, we show that $|F_r|$ grows to infinity almost surely as $n$
grows to infinity. This implies that $|F_j|$ also grows to infinity almost surely for
$j = 1, 2, \ldots, r$, since $|F_{j+1}| \le |F_j|$ for $j = 1, 2, \ldots, r-1$. The proof follows from
the definitions of compact support and $d_{\max}$.

Lemma 17.4.2

$$\lim_{n \to \infty} \frac{|F_r|}{n} = p \quad \text{(a.s.)} \quad \text{for some constant } p > 0.$$

For any set of customers $T \subseteq N$, let $b^*(T)$ be the minimum number of vehicles
needed to serve the customers in $T$; that is, $b^*(T)$ is the optimal solution to the
bin-packing problem defined by item sizes equal to the demands of the customers
in $T$. We can now present a family of lower bounds on $Z_u^*$ that hold for different
values of $r \ge 1$.

Lemma 17.4.3

$$Z_u^* \ge 2 \frac{d_{\max}}{r} \sum_{j=2}^{r} b^*(F_j) \quad \text{for any } r \ge 1.$$

Proof. Given an optimal solution to the UCVRP, let $K_r^*$ be the number of vehicles
in the optimal solution that serve at least one customer from $S_r$, and for
$j = 1, 2, \ldots, r-1$, let $K_j^*$ be the number of vehicles in the optimal solution that
serve at least one customer in the set $S_j$ but do not serve any customers in
$F_{j+1}$. Also, let $V_j^*$ be the number of vehicles in the optimal solution that serve
at least one customer in $F_j$. By these definitions, $V_j^* = \sum_{i=j}^{r} K_i^*$ for
$j = 1, 2, \ldots, r$; hence, $K_j^* = V_j^* - V_{j+1}^*$ for $j = 1, 2, \ldots, r-1$, and $K_r^* = V_r^*$.

Note that $V_j^* \ge b^*(F_j)$ for $j = 1, 2, \ldots, r$, since $V_j^*$ represents the number of
vehicles used in a feasible packing of the demands of customers in $F_j$, while
$b^*(F_j)$ represents the number of bins used in an optimal packing.

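By the definition of $K_j^*$ and $d_j$, $Z_u^* \ge 2 \sum_{j=1}^{r} d_j K_j^*$, and therefore,

$$\begin{aligned}
Z_u^* &\ge 2 d_r V_r^* + \sum_{j=1}^{r-1} 2 d_j \left(V_j^* - V_{j+1}^*\right) \\
      &= 2 d_1 V_1^* + \sum_{j=2}^{r} 2 (d_j - d_{j-1}) V_j^* \\
      &= 2 \sum_{j=2}^{r} (d_j - d_{j-1}) V_j^* \quad (\text{since } d_1 = 0) \\
      &\ge 2 \sum_{j=2}^{r} (d_j - d_{j-1}) b^*(F_j) \quad [\text{since } V_j^* \ge b^*(F_j)].
\end{aligned}$$

Since $d_j - d_{j-1} = \frac{d_{\max}}{r}$ for every $j$, the last expression equals
$2 \frac{d_{\max}}{r} \sum_{j=2}^{r} b^*(F_j)$, which proves the lemma.

Note that Lemma 17.4.3 provides a deterministic lower bound; that is, no
probabilistic assumptions are involved. Lemmas 17.4.2 and 17.4.3 are both
required to provide a lower bound on $\frac{1}{n} Z_u^*$ that holds almost surely.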
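
Because the bound of Lemma 17.4.3 is deterministic, it can be evaluated on any
concrete instance. The Python sketch below is a hedged illustration rather than
the book's procedure: it assumes unit vehicle capacity, uses the largest observed
distance in place of $d_{\max}$, and replaces $b^*(F_j)$ by the weaker but easily
computed bound $\lceil \sum_{x_k \in F_j} w_k \rceil \le b^*(F_j)$, so the result remains
a valid lower bound. The name `ring_lower_bound` is hypothetical.

```python
import math


def ring_lower_bound(depot, customers, demands, r):
    """Evaluate 2 * (d_max / r) * sum_{j=2..r} ceil(total demand in F_j)."""
    d = [math.dist(depot, c) for c in customers]
    d_max = max(d)  # assumption: sample maximum stands in for the sup over A
    width = d_max / r
    total = 0
    for j in range(2, r + 1):
        d_j = (j - 1) * width
        # F_j: customers strictly farther from the depot than d_j.
        load = sum(w for dk, w in zip(d, demands) if dk > d_j)
        # Bins needed >= total size; the tolerance guards against float noise
        # pushing an exact integer load just above that integer.
        total += math.ceil(load - 1e-12)
    return 2 * width * total
```

With four customers at distances 1, 2, 3, and 4 on a line, demands of 0.6 each,
and $r = 2$, the sketch returns 8, well below the true optimum of 20 in which
every customer needs its own vehicle.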

Lemma 17.4.4 Under the conditions of Theorem 17.4.1, we have

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* \ge 2 \gamma E(d) \quad \text{(a.s.)}.$$



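Proof. Lemma 17.4.3 implies that

$$\begin{aligned}
\lim_{n \to \infty} \frac{1}{n} Z_u^*
  &\ge 2 \frac{d_{\max}}{r} \sum_{j=2}^{r} \lim_{n \to \infty} \frac{b^*(F_j)}{n} \\
  &= 2 \frac{d_{\max}}{r} \sum_{j=2}^{r} \lim_{n \to \infty} \frac{b^*(F_j)}{|F_j|}
     \lim_{n \to \infty} \frac{|F_j|}{n}.
\end{aligned}$$

From Lemma 17.4.2, $|F_j|$ grows to infinity almost surely as $n$ grows to infinity,
for $j = 1, 2, \ldots, r$. Moreover, since demands and locations are independent of each
other, the demands in $F_j$, $j = 1, 2, \ldots, r$, are distributed like $\Phi$. Therefore,

$$\lim_{n \to \infty} \frac{b^*(F_j)}{|F_j|} = \lim_{|F_j| \to \infty} \frac{b^*(F_j)}{|F_j|} = \gamma \quad \text{(a.s.)}.$$

Hence, almost surely,

$$\begin{aligned}
\lim_{n \to \infty} \frac{1}{n} Z_u^*
  &\ge 2 \frac{d_{\max}}{r} \sum_{j=2}^{r} \gamma \lim_{n \to \infty} \frac{|F_j|}{n} \\
  &= 2 \frac{d_{\max}}{r} \gamma \lim_{n \to \infty} \frac{1}{n} \sum_{j=2}^{r} |F_j|.
\end{aligned}$$

Since

$$F_j = \bigcup_{i=j}^{r} S_i \quad \text{for } j = 1, 2, \ldots, r,$$

we have $|F_j| = \sum_{i=j}^{r} |S_i|$; hence, almost surely,

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* \ge 2 \frac{d_{\max}}{r} \gamma
  \lim_{n \to \infty} \frac{1}{n} \sum_{j=2}^{r} \sum_{i=j}^{r} |S_i|.$$

Each $|S_j|$ appears $j - 1$ times in the double sum, and $(j-1)\frac{d_{\max}}{r} = d_j$ by
the definition of $d_j$. Hence,

$$\lim_{n \to \infty} \frac{1}{n} Z_u^*
  \ge 2 \gamma \lim_{n \to \infty} \frac{1}{n} \sum_{j=2}^{r} d_j |S_j|
  = 2 \gamma \lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{r} d_j |S_j|,$$

since $d_1 = 0$ and $|S_1| \le n$. By the definition of $d_j$ and $S_j$,
$d_j \ge d_k - \frac{d_{\max}}{r}$ for all $x_k \in S_j$. Then almost surely,

$$\begin{aligned}
\lim_{n \to \infty} \frac{1}{n} Z_u^*
  &\ge 2 \gamma \lim_{n \to \infty} \frac{1}{n} \sum_{x_k \in N} \left(d_k - \frac{d_{\max}}{r}\right) \\
  &= 2 \gamma \lim_{n \to \infty} \frac{1}{n} \sum_{x_k \in N} d_k - 2 \gamma \frac{d_{\max}}{r} \\
  &= 2 \gamma E(d) - 2 \gamma \frac{d_{\max}}{r}.
\end{aligned}$$

This lower bound holds for arbitrarily large $r$; hence,

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* \ge 2 \gamma E(d) \quad \text{(a.s.)}.$$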



In the next section, we show that this lower bound is tight by presenting an 
upper bound on the cost of the optimal solution that asymptotically approaches 
the same value. 



17.4.2 An Upper Bound

We prove Theorem 17.4.1 by analyzing the cost of the following three-step heuristic 
that provides an upper bound on Z*. In the first step, we partition the area A 
into subregions. Then, for each of these subregions, we find the optimal packing of 
the customers’ demands in the subregion, into bins of unit size. Finally, for each 
subregion, we allocate one vehicle to serve the customers in each bin. 

The Region-Partitioning Scheme 

For a fixed $h > 0$, let $G(h)$ be an infinite grid of squares with side $\frac{h}{\sqrt{2}}$ and
edges parallel to the system coordinates. Recall that $A$ is the compact support of
the distribution function $\mu$, and let $A_1, A_2, \ldots, A_{t(h)}$ be the intersections of the
squares of $G(h)$ with the compact support $A$ that have $\mu(A_i) > 0$. Note that
$t(h) < \infty$ since $A$ is compact, and $t(h)$ is independent of $n$.

Let $N(i)$ be the set of customers located in subregion $A_i$, and define
$n(i) = |N(i)|$. For every $i = 1, 2, \ldots, t(h)$, let $b^*(i)$ be the minimum number of
bins needed to pack the demands of customers in $N(i)$. Finally, for each subregion
$A_i$, $i = 1, 2, \ldots, t(h)$, let $n_j(i)$ be the number of customers in the $j$th bin of
this optimal packing, for each $j = 1, 2, \ldots, b^*(i)$.

We now proceed to find an upper bound on the value of our heuristic. Recall
that for each bin produced by the heuristic, we send a single vehicle to serve all
the customers in the bin. First, the vehicle visits the customer closest to the depot
in the subregion to which the bin belongs, then serves all the customers in the bin
in any order, and then returns to the depot through the closest customer again.
Let $d(i)$ be the distance from the depot to the closest customer in $N(i)$, that is, in
subregion $A_i$. Note that since each subregion $A_i$ is a subset of a square of side
$\frac{h}{\sqrt{2}}$, the distance between any two customers in $A_i$ is no more than $h$.
Consequently, using the method just described, we calculate that the distance
traveled by the vehicle that serves all the customers in the $j$th bin of subregion
$A_i$ is no more than

$$2 d(i) + h (n_j(i) + 1).$$

Therefore,

$$Z_u^* \le \sum_{i=1}^{t(h)} \sum_{j=1}^{b^*(i)} \bigl( 2 d(i) + h (n_j(i) + 1) \bigr)
       \le 2 \sum_{i=1}^{t(h)} b^*(i) d(i) + 2nh. \qquad (17.3)$$



This inequality will be coupled with the following lemma to find an almost sure 
upper bound on the cost of this heuristic. 

Lemma 17.4.5 Under the conditions of Theorem 17.4.1, we have

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{t(h)} b^*(i) d(i) \le \gamma E(d) \quad \text{(a.s.)}.$$

Proof. Let $p_i = \mu(A_i)$ be the probability that a given customer $x_k$ falls in
subregion $A_i$. Since $p_i > 0$, by the strong law of large numbers,
$\lim_{n \to \infty} \frac{n(i)}{n} = p_i$ almost surely, and therefore $n(i)$ grows to infinity almost
surely as $n$ grows to infinity. Thus, we have

$$\lim_{n \to \infty} \frac{b^*(i)}{n(i)} = \lim_{n(i) \to \infty} \frac{b^*(i)}{n(i)} = \gamma \quad \text{(a.s.)}.$$

Hence,

$$\begin{aligned}
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{t(h)} b^*(i) d(i)
  &= \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{t(h)} \frac{b^*(i)}{n(i)} \, n(i) d(i) \\
  &\le \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{t(h)} \frac{b^*(i)}{n(i)} \sum_{x_k \in N(i)} d_k
     \quad [\text{since } d(i) \le d_k, \ \forall x_k \in N(i)] \\
  &= \sum_{i=1}^{t(h)} \lim_{n \to \infty} \frac{b^*(i)}{n(i)}
     \lim_{n \to \infty} \frac{1}{n} \sum_{x_k \in N(i)} d_k \\
  &= \gamma \lim_{n \to \infty} \frac{1}{n} \sum_{x_k \in N} d_k.
\end{aligned}$$

Using the strong law of large numbers, we have

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{t(h)} b^*(i) d(i) \le \gamma E(d) \quad \text{(a.s.)},$$

which completes the proof of this lemma. ∎

Remark: A simple modification of the proof of Lemma 17.4.5 shows that the ine- 
quality that appears in the statement of the lemma can be replaced by equality 
(see Exercise 17.5). 

We can now finish the proof of Theorem 17.4.1. From (17.3), we have

$$\frac{1}{n} Z_u^* \le \frac{2}{n} \sum_{i=1}^{t(h)} b^*(i) d(i) + 2h.$$

Taking the limits and using Lemma 17.4.5, we obtain

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* \le 2 \gamma E(d) + 2h \quad \text{(a.s.)}.$$

Since this inequality holds for arbitrarily small $h > 0$, we have

$$\lim_{n \to \infty} \frac{1}{n} Z_u^* \le 2 \gamma E(d) \quad \text{(a.s.)}.$$

This upper bound combined with the lower bound of Lemma 17.4.4 proves the 
main theorem. 



17.5 Probabilistic Analysis of Classical Heuristics 

Bienstock et al. (1993a) analyzed the average performance of heuristics that belong
to the route-first-cluster-second class. Recall our definition of this class: all
those heuristics that first order the customers according to their locations and
then partition this ordering to produce feasible clusters.

It is clear that the UITP($\alpha$) and UOP($\alpha$) heuristics described in Sect. 17.3
belong to this class. As mentioned in Sect. 17.2, the sweep algorithm suggested by
Gillett and Miller can also be viewed as a member of this class.

Bienstock et al. show that the performance of any heuristic in this class is
strongly related to the performance of an inefficient bin-packing heuristic called
next-fit (NF). The next-fit bin-packing heuristic can be described in the following
manner. Given a list of $n$ items, start with item 1 and place it in bin 1. Suppose
we are packing item $j$; let bin $i$ be the highest indexed nonempty bin. If item $j$
fits in bin $i$, then place it there; else place it in a new bin indexed $i + 1$. Thus,
NF is an online heuristic; that is, it assigns items to bins according to the order in
which they appear, without using any knowledge of subsequent items in the list.

The NF heuristic possesses some interesting properties that will be useful in
the analysis of the route-first-cluster-second class. Assume the items are indexed
$1, 2, \ldots, n$ and let a consecutive heuristic be one that assigns items to bins such
that items in any bin appear consecutively in the sequence. The following is a
simple observation.

Property 17.5.1 Among all consecutive heuristics, NF uses the least number 
of bins. 
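
The next-fit rule described above takes only a few lines of Python. The sketch
below is illustrative: the function name `next_fit` and the default unit bin
capacity are assumptions made here. Since a bin is never reopened once a new one
has been started, every bin holds a consecutive run of the input sequence.

```python
def next_fit(sizes, capacity=1.0):
    """Pack items in the given order; return the bins as lists of item sizes."""
    bins, load = [], 0.0
    for s in sizes:
        if s > capacity:
            raise ValueError("item larger than the bin capacity")
        if not bins or load + s > capacity:
            bins.append([])  # open bin i + 1; earlier bins are never revisited
            load = 0.0
        bins[-1].append(s)
        load += s
    return bins
```

On the sizes 0.5, 0.6, 0.5, 0.4 the sketch opens three bins, although two bins
suffice when the order may be changed (0.5 with 0.5, and 0.6 with 0.4).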

The next property is similar to a property developed in Sect. 5.2 for b*, the 
optimal solution to the bin-packing problem. 

Property 17.5.2 Let the item sizes $w_1, w_2, \ldots, w_n, \ldots$ in the bin-packing problem
be a sequence of independent random variables and let $b_n^{NF}$ be the number of bins
produced by NF on the items $1, 2, \ldots, n$. For every $t \ge 0$,

$$\Pr\left\{ \left| b_n^{NF} - E(b_n^{NF}) \right| > t \right\} \le 2 \exp(-t^2 / 8n). \qquad (17.4)$$

A direct result of this property is the following. The proof is left as an exercise 
(Exercise 17.2). 

Corollary 17.5.3 For any $n \ge 1$,

$$b_n^{NF} \le E(b_n^{NF}) + 4 \sqrt{n \log n} \quad \text{(a.s.)}.$$

The next property is a simple consequence of the theory of subadditive processes 
(see Sect. 5.2) and the structure of solutions generated by NF. 

Property 17.5.4 For any distribution of item sizes, there exists a constant
$\gamma^{NF} > 0$ such that $\lim_{n \to \infty} \frac{b_n^{NF}}{n} = \gamma^{NF}$ almost surely, where $b_n^{NF}$ is the number of
bins produced by the NF packing and $\gamma^{NF}$ depends only on the distribution of the
item sizes.

These properties are used to prove the following theorem, the main result of this 
section. 

Theorem 17.5.5

(i) Let H be a route-first-cluster-second heuristic. Then, under the assumptions
of Theorem 17.4.1, we have

$$\lim_{n \to \infty} \frac{1}{n} Z^H \ge 2 \gamma^{NF} E(d) \quad \text{(a.s.)}.$$

(ii) The UOP($\alpha$) heuristic is the best possible heuristic in this class; that is,
for any fixed $\alpha \ge 1$, we have

$$\lim_{n \to \infty} \frac{1}{n} Z^{\mathrm{UOP}(\alpha)} = 2 \gamma^{NF} E(d) \quad \text{(a.s.)}.$$

In view of Theorems 17.4.1 and 17.5.5, it is interesting to compare $\gamma^{NF}$ to $\gamma$
since the asymptotic error of any heuristic H in the route-first-cluster-second
class satisfies

$$\lim_{n \to \infty} Z^H / Z_u^* \ge \lim_{n \to \infty} Z^{\mathrm{UOP}(\alpha)} / Z_u^* = \gamma^{NF} / \gamma.$$

Although in general the ratio is difficult to characterize, Karmarkar was able to
characterize it for the case when the item sizes are uniformly distributed on an
interval $(0, a]$ for $0 < a \le 1$. For instance, for $a$ satisfying $\frac{1}{2} < a \le 1$, we have

$$\frac{\gamma^{NF}}{\gamma} = \frac{2}{a} \left\{ \frac{1}{12 a^3} \left(15 a^3 - 9 a^2 + 3a - 1\right)
  + \sqrt{2} \left(\frac{1-a}{2a}\right)^2 \tanh\left(\frac{1-a}{\sqrt{2}\, a}\right) \right\},$$

so that when the item sizes are uniform on $(0, 1]$, the above ratio is $\frac{4}{3}$, which
implies that UOP($\alpha$) converges to a value that is 33.3% more than the optimal cost,
a very disappointing performance for the best heuristic currently available in
terms of worst-case behavior.
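
The value $\frac{4}{3}$ can be reproduced numerically. The Python sketch below rests on
two assumptions that should be checked against Karmarkar's paper: the closed form
for the next-fit constant on $(0, a]$ with $\frac{1}{2} < a \le 1$ as written in the code,
and $\gamma = a/2$, which follows from perfect packing for a uniform density. The
function names are introduced here for illustration.

```python
import math


def nf_constant_uniform(a):
    """Assumed closed form of the next-fit constant for sizes uniform on (0, a]."""
    if not 0.5 < a <= 1:
        raise ValueError("the closed form is assumed only for 1/2 < a <= 1")
    poly = (15 * a**3 - 9 * a**2 + 3 * a - 1) / (12 * a**3)
    corr = (
        math.sqrt(2)
        * ((1 - a) / (2 * a)) ** 2
        * math.tanh((1 - a) / (math.sqrt(2) * a))
    )
    return poly + corr


def nf_ratio_uniform(a):
    """Ratio of the next-fit constant to the bin-packing constant a / 2."""
    return nf_constant_uniform(a) / (a / 2)
```

At $a = 1$ the correction term vanishes, the next-fit constant is $\frac{2}{3}$, and the
ratio to $\gamma = \frac{1}{2}$ is $\frac{4}{3}$.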

Moreover, heuristics in the route-first-cluster-second class can never be asymp- 
totically optimal for the UCVRP, except in some trivial cases (e.g., demands are 
all the same size). In fact, Theorem 17.5.5 clearly demonstrates that the route- 
first-cluster-second class suffers from misplaced priorities. The routing (in the first 
phase) is done without any regard to the customer demands, and thus this leads 
to a packing of demands into vehicles that is at best like the next-fit bin-packing 
heuristic. This is clearly suboptimal in all but trivial cases, one being when
customers have equal demands, and thus we see the connection with the results of the
previous chapter. Therefore, this theorem shows that an asymptotically optimal 
heuristic for the UCVRP must use an asymptotically optimal bin-packing heuristic 
to pack the customer demands into the vehicles. 

In the next two subsections, we prove Theorem 17.5.5 by developing a lower
bound (Sect. 17.5.1) on $Z^H$ and an upper bound on $Z^{\mathrm{UOP}(\alpha)}$ (Sect. 17.5.2).



17.5.1 A Lower Bound 

In this section, we present a lower bound on the solution produced by these 
heuristics. Let H denote a route-first-cluster-second heuristic. 

As in Sect. 17.4.1, let $A$ be the compact support of the distribution $\mu$, and
define $d_{\max} = \sup_{x \in A} \{\|x\|\}$. Given a fixed integer $r \ge 1$, define
$d_j = (j-1)\frac{d_{\max}}{r}$ for $j = 1, 2, \ldots, r$, and construct the following $r$ sets of
customers:

$$F_j = \{x_k \in N \mid d_j < d_k\} \quad \text{for } j = 1, 2, \ldots, r.$$

Note that $F_r \subseteq F_{r-1} \subseteq \cdots \subseteq F_1$, and $F_1 = N$ since, without loss of generality,
$d_k > 0$ for all $x_k \in N$.

Let the customers be indexed $x_1, x_2, \ldots, x_n$ according to the order determined
by the heuristic H in the route-first phase.

For any set of customers $T \subseteq N$, let $b^{NF}(T)$ be the number of bins generated
by the next-fit heuristic when applied to the bin-packing problem defined by item
sizes equal to the demands of the customers in $T$, packed in the order of increasing
index.

Lemma 17.5.6 For any $r \ge 1$,

$$Z^H \ge 2 \frac{d_{\max}}{r} \sum_{j=2}^{r} b^{NF}(F_j).$$

Proof. For a given solution constructed by H, let $V(F_j)$ be the number of vehicles
that serve at least one customer in $F_j$, for $j = 1, 2, \ldots, r$. By this definition,
$V(F_j) - V(F_{j+1})$, $j = 1, 2, \ldots, r-1$, is exactly the number of vehicles whose
farthest customer visited is in $F_j$ but not in $F_{j+1}$, and trivially $V(F_r)$ is the
number of vehicles whose farthest customer visited is in $F_r$. Hence,

$$\begin{aligned}
Z^H &\ge 2 d_r V(F_r) + \sum_{j=1}^{r-1} 2 d_j \bigl( V(F_j) - V(F_{j+1}) \bigr) \\
    &= 2 d_1 V(F_1) + \sum_{j=2}^{r} 2 (d_j - d_{j-1}) V(F_j).
\end{aligned}$$

For a given subset of customers $F_j$, $j = 1, 2, \ldots, r$, the $V(F_j)$ vehicles that
contain these customer demands (in the solution produced by H) can be ordered
in such a way that the customer indices are in increasing order. Disregarding the
demands of customers in these vehicles that are not in $F_j$, this represents the
solution produced by a consecutive packing heuristic on the demands of customers
in $F_j$. By Property 17.5.1, we must have $V(F_j) \ge b^{NF}(F_j)$ for every
$j = 1, 2, \ldots, r$. This, together with $d_1 = 0$ and $d_j - d_{j-1} = \frac{d_{\max}}{r}$, implies that

$$Z^H \ge 2 \sum_{j=2}^{r} \frac{d_{\max}}{r} b^{NF}(F_j).$$

This lemma is used to derive an asymptotic lower bound on the cost of the
solution produced by H that holds almost surely. The proof of the following lemma
is identical to the proof of Lemma 17.4.4.

Lemma 17.5.7 Under the conditions of Theorem 17.4.1, we have

$$\lim_{n \to \infty} \frac{1}{n} Z^H \ge 2 \gamma^{NF} E(d) \quad \text{(a.s.)}.$$

In the next section, we show that this lower bound is asymptotically tight in the
case of UOP($\alpha$) by presenting an upper bound that approaches the same value.

17.5.2 The UOP($\alpha$) Heuristic

We prove Theorem 17.5.5 by finding an upper bound on $Z^{\mathrm{UOP}(\alpha)}$. Let $L^{\alpha}$ be
the length of the $\alpha$-optimal tour selected by UOP($\alpha$). Starting at the depot and
following the tour in an arbitrary orientation, the customers and the depot are
numbered $x^{(0)}, x^{(1)}, x^{(2)}, \ldots, x^{(n)}$, where $x^{(0)}$ is the depot. Select an integer
$m = \lceil n^{\beta} \rceil$ for some fixed $\beta \in (\frac{1}{2}, 1)$, and note that for each such $\beta$, we have
$\lim_{n \to \infty} \frac{m}{n} = 0$ [i.e., $m = o(n)$] and $\lim_{n \to \infty} \frac{\sqrt{n}}{m} = 0$ [i.e., $\sqrt{n} = o(m)$]. We
partition the path from $x^{(1)}$ to $x^{(n)}$ into $m + 1$ segments, such that each one
contains exactly $\lfloor \frac{n}{m} \rfloor$ customers, except possibly the last one.

Number the segments $1, 2, \ldots, m+1$ according to their appearance on the
traveling salesman tour, where each segment has exactly $\lfloor \frac{n}{m} \rfloor$ customers except
possibly segment $m + 1$. Let $L_i$ (respectively, $N_i$) be the length of (respectively,
subset of customers in) segment $i$, $1 \le i \le m + 1$. Finally, let $n_i = |N_i|$,
$i = 1, 2, \ldots, m + 1$.

To obtain an upper bound on the cost of UOP(<a), we apply the next-fit heuristic 
to each segment separately, where items are packed in bins in the same order they 
appear in the segment. This gives us a partition of the tour that must provide an 
upper bound on the cost produced by UOP(<a). Let bf F be the number of bins pro- 
duced by the next-fit heuristic when applied to the customer demands in segment i. 
We assign a single vehicle to each bin produced by the above procedure, each of 
which starts at the depot, visits the customers assigned to its corresponding bin 
in the same order as they appear on the traveling salesman tour, and then returns 
to the depot. Let di be the distance from the depot to the farthest customer in Ni. 
Clearly, the total distance traveled by all the vehicles that serve the customers in 
segment i, 1 < i < ra + 1, is no more than 

26+ di + Li. 

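Hence,

$$
\begin{aligned}
Z^{UOP(\alpha)} &\le 2 \sum_{i=1}^{m+1} b_i^{NF} d_i + L^{\alpha} \\
&\le 2 \sum_{i=1}^{m} b_i^{NF} d_i + 2 b_{m+1}^{NF} d_{\max} + \alpha L^*. \qquad (17.5)
\end{aligned}
$$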
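
The segment-by-segment construction behind (17.5) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names `next_fit` and `segmented_next_fit_cost`, the unit vehicle capacity, and the representation of the tour as a list of coordinates in tour order are all hypothetical choices made for the example. The sketch splits the tour into $m$ segments of $\lfloor n/m \rfloor$ customers plus a remainder segment, packs each segment with next-fit, and sums the lengths of the resulting vehicle routes.

```python
import math


def next_fit(demands, capacity=1.0):
    """Pack items in the given order; open a new bin when the next item does not fit."""
    bins, load = [], None
    for idx, w in enumerate(demands):
        if load is None or load + w > capacity + 1e-12:
            bins.append([])          # open a new bin (vehicle)
            load = 0.0
        bins[-1].append(idx)
        load += w
    return bins


def segmented_next_fit_cost(tour, depot, demands, m, capacity=1.0):
    """Total route length when next-fit is applied separately to each tour segment.

    tour    -- customer coordinates in tour order (depot excluded)
    demands -- customer demands, in the same order as `tour`
    m       -- number of full segments, each with floor(n / m) customers
    """
    n = len(tour)
    size = n // m
    if size == 0:
        raise ValueError("need m <= n")
    bounds = [(i * size, (i + 1) * size) for i in range(m)]
    if m * size < n:                 # remainder segment m + 1
        bounds.append((m * size, n))
    total = 0.0
    for start, end in bounds:
        for bin_ in next_fit(demands[start:end], capacity):
            stops = [tour[start + j] for j in bin_]
            path = [depot] + stops + [depot]
            total += sum(math.dist(p, q) for p, q in zip(path, path[1:]))
    return total
```

Because every vehicle visits its customers in tour order, the cost returned for segment $i$ is at most $2 b_i^{NF} d_i + L_i$, which is the per-segment bound used in the text.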

Lemma 17.5.8 Under the conditions of Theorem 17.3.1, we have

$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{m} b_i^{NF} d_i \le \gamma^{NF} E(d) \quad (a.s.).$$

Proof. Since the number of customers in every segment $i$, $1 \le i \le m$, is exactly
$n_i = \lfloor n/m \rfloor$ and $\lim_{n\to\infty} m/n = 0$, we have for a given $i$, $1 \le i \le m$,

$$b_i^{NF} \le E(b_i^{NF}) + \sqrt{9 K n_i \log n_i} \quad (a.s.),$$

for any $K > 2$.

We now show that, for sufficiently large $n$, these $m$ inequalities hold
simultaneously almost surely. To prove this, note that Property 17.5.2 tells us
that, for $n_i$ large enough, the probability that one such inequality does not hold is
no more than $2 \exp(-K \log n_i) = 2 n_i^{-K}$. Thus, the probability that at least one of
these inequalities is violated is no more than $2m(\frac{n}{m} - 1)^{-K}$. By the Borel-Cantelli
lemma, these $m$ inequalities hold almost surely if $\sum_n m (\frac{n}{m} - 1)^{-K} < \infty$. Choosing
$K > \frac{1+\beta}{1-\beta} > 3$ shows that this holds for any $m = \lceil n^{\beta} \rceil$, where $\frac{1}{2} < \beta < 1$.
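
The choice of $K$ can be checked with a short calculation. The following is a sketch of the order-of-magnitude argument only, ignoring the rounding in $m = \lceil n^{\beta} \rceil$ and the constant hidden in $\frac{n}{m} - 1 \sim n^{1-\beta}$:

```latex
\[
  m\left(\frac{n}{m}-1\right)^{-K}
  \;\sim\; n^{\beta}\,\bigl(n^{1-\beta}\bigr)^{-K}
  \;=\; n^{\,\beta-K(1-\beta)} .
\]
% The series \sum_n n^{\beta-K(1-\beta)} converges exactly when the exponent is below -1:
\[
  \beta-K(1-\beta) < -1
  \quad\Longleftrightarrow\quad
  K > \frac{1+\beta}{1-\beta}.
\]
% For \beta > 1/2 the threshold exceeds (1 + 1/2)/(1 - 1/2) = 3,
% because (1+\beta)/(1-\beta) is increasing in \beta on (0, 1).
```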

Thus,

$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{m} b_i^{NF} d_i \le \gamma^{NF} \lim_{n\to\infty} \frac{1}{m} \sum_{i=1}^{m} d_i \quad (a.s.).$$



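Clearly, $d_i \le d_k + L_i$ for every $x_k \in N_i$ and every $i = 1, 2, \ldots, m$. Thus,

$$d_i \le \frac{1}{\lfloor n/m \rfloor} \sum_{x_k \in N_i} d_k + L_i \quad \text{for every } i = 1, 2, \ldots, m.$$

Hence,

$$
\begin{aligned}
\lim_{n\to\infty} \frac{1}{m} \sum_{i=1}^{m} d_i
&\le \lim_{n\to\infty} \frac{1}{n-m} \sum_{x_k \in N} d_k + \lim_{n\to\infty} \frac{1}{m} L^{\alpha} \\
&\le \lim_{n\to\infty} \frac{1}{n-m} \sum_{x_k \in N} d_k + \alpha \lim_{n\to\infty} \frac{1}{m} L^*.
\end{aligned}
$$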



Applying the strong law of large numbers and using $\lim_{n\to\infty} m/n = 0$, we have

$$\lim_{n\to\infty} \frac{1}{n-m} \sum_{x_k \in N} d_k = E(d) \quad (a.s.).$$



Now from Chap. 5, we know that the length of the optimal traveling salesman
tour through a set of $k$ points independently and identically distributed in a given
region grows almost surely like $\sqrt{k}$. This together with $\lim_{n\to\infty} \sqrt{n}/m = 0$ implies
that

$$\lim_{n\to\infty} \frac{L^*}{m} = 0 \quad (a.s.).$$

These facts complete the proof. ∎

We can now complete the proof of Theorem 17.4.1. From (17.5) and Lemma 5.2.1,
we have

$$\lim_{n\to\infty} \frac{1}{n} Z^{UOP(\alpha)} \le 2\gamma^{NF} E(d) + 2 d_{\max} \lim_{n\to\infty} \frac{1}{n} b_{m+1}^{NF} + \alpha \lim_{n\to\infty} \frac{1}{n} L^* \quad (a.s.).$$

Finally, using Beardwood et al.’s (1959) result (see Theorem 5.3.2) and the fact
that the number of points in segment $m + 1$ is at most $m$, we obtain the desired
result.

17.6 The Uniform Model

To our knowledge, no polynomial-time algorithm that is asymptotically optimal
is known for the UCVRP for general $\Phi$. We now describe such a heuristic for the
case where $\Phi$ is uniform on the interval [0, 1]. In the unit interval, it is known that
there exists an asymptotically optimal solution to the bin-packing problem with
at most two items per bin. This forms the basis for the heuristic for the UCVRP,
called optimal matching of pairs (OMP). It considers only feasible solutions in
which each vehicle visits no more than two customers. Among all such feasible
solutions, the heuristic finds the one with the minimum cost. This can be done by
formulating the following integer linear program.

For every $x_k, x_l \in N$, let

$$
c_{kl} =
\begin{cases}
d_k + d_{kl} + d_l, & \text{if } k \ne l \text{ and } w_k + w_l \le 1; \\
2 d_k, & \text{if } k = l; \\
\infty, & \text{otherwise.}
\end{cases}
$$

The integer program to solve is

$$
\begin{aligned}
\text{Problem } P: \quad \min\ & \sum_{k \le l} c_{kl} X_{kl} \\
\text{s.t.}\ & \sum_{l \ge k} X_{kl} + \sum_{l < k} X_{lk} = 1, \quad \forall k = 1, 2, \ldots, n, \qquad & (17.6) \\
& X_{kl} \in \{0, 1\}, \quad \forall k \le l. & (17.7)
\end{aligned}
$$

For $k \le l$, $X_{kl}$ is 1 if a vehicle delivers items to customers $x_k$ and $x_l$ and is 0
otherwise (with $k = l$ meaning that the vehicle serves $x_k$ alone). Constraint (17.6)
ensures that each customer is visited.

It is not hard to see that $P$ can be solved in polynomial time since it is no more
than a classical weighted matching problem defined on a specific graph. Define the
following graph $G = (V, E)$, where each customer $x_k$ is represented by two nodes
$v_k$ and $v'_k$, for $k = 1, 2, \ldots, n$. The set of edges of $G$ is defined as follows:

$$
\begin{aligned}
E = {} & \{(v_k, v'_k) \mid x_k \in N\} \\
& \cup \{(v_k, v_l) \mid x_k \in N, x_l \in N, k \ne l, w_k + w_l \le 1\} \\
& \cup \{(v'_k, v'_l) \mid x_k \in N, x_l \in N, k \ne l, w_k + w_l \le 1\}.
\end{aligned}
$$

Thus, $G$ has $2n$ vertices. The length of edge $(v_k, v_l)$, for $k \ne l$, is $c_{kl}$, of edge
$(v_k, v'_k)$ is $c_{kk}$, and of edge $(v'_k, v'_l)$ is 0, for all $k$ and $l$.

Note that any given feasible solution to $P$ can be transformed into a feasible
solution to the matching problem on $G$ with the same cost. For any feasible solution
to $P$, choose edge $(v_k, v'_k)$ if customer $k$ is served by a vehicle that does not serve
any other customer and choose edges $(v_k, v_l)$ and $(v'_k, v'_l)$ if customers $x_k$ and $x_l$
are visited together. Similarly, any feasible solution to the matching problem can
be transformed into a feasible solution to $P$ with the same cost. Hence, the two
problems are equivalent.
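
For very small instances, Problem $P$ can be solved directly without the matching reformulation, which gives a convenient way to check an implementation. The sketch below is an illustration only: it uses an exponential-time dynamic program over subsets of customers rather than a polynomial matching algorithm, and the function name `omp_cost`, the unit capacity, and the depot at the origin are assumptions made for the example.

```python
import math
from functools import lru_cache


def omp_cost(points, demands, depot=(0.0, 0.0)):
    """Minimum total distance when every vehicle serves at most two customers.

    Exhaustive dynamic program over subsets (bitmasks); suitable only for
    small n, unlike the O(n^3) matching approach described in the text.
    """
    n = len(points)
    d = [math.dist(depot, p) for p in points]

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0.0
        k = (mask & -mask).bit_length() - 1      # lowest-index unserved customer
        rest = mask & ~(1 << k)
        cost = 2 * d[k] + best(rest)             # c_kk: serve k alone
        for l in range(k + 1, n):
            if (rest >> l) & 1 and demands[k] + demands[l] <= 1:
                c_kl = d[k] + math.dist(points[k], points[l]) + d[l]
                cost = min(cost, c_kl + best(rest & ~(1 << l)))
        return cost

    return best((1 << n) - 1)
```

Fixing the lowest-index unserved customer at each step avoids enumerating the same pairing in several orders; the recursion either sends that customer alone or pairs it with a compatible partner, exactly the two cases encoded by $c_{kk}$ and $c_{kl}$.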

An optimal matching in $G$ can be found in $O(n^3)$ time using Lawler’s (1976) algorithm.
The main result of this section is the following.

Theorem 17.6.1 Let $x_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random
variables having a distribution $\mu$ with compact support in $\mathbb{R}^2$. Let

$$E(d) = \int_{\mathbb{R}^2} \|x\| \, d\mu(x).$$

Let the demands $w_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random variables
having a uniform distribution on [0, 1], and assume that the demands and the
location of the customers are independent of each other. Then, the OMP heuristic
is asymptotically optimal. That is, with probability 1,

$$\lim_{n\to\infty} \frac{Z_n^*}{n} = \lim_{n\to\infty} \frac{Z_n^{OMP}}{n} = E(d).$$



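To prove that the OMP heuristic is asymptotically optimal, we approximate its
performance by that of the sliced region-partitioning heuristic with parameters $h$
and $r$ (SRP($h, r$)). For any fixed positive integer $r \ge 1$, the set $N$ is partitioned
into the following $2r$ disjoint subsets, some of which may be empty:

$$N_j = \Big\{ x_k \in N : \frac{1}{2}\Big(1 - \frac{j+1}{r}\Big) < w_k \le \frac{1}{2}\Big(1 - \frac{j}{r}\Big) \Big\}, \quad j = 1, 2, \ldots, r-1,$$

and

$$N^j = \Big\{ x_k \in N : \frac{1}{2}\Big(1 + \frac{j-1}{r}\Big) < w_k \le \frac{1}{2}\Big(1 + \frac{j}{r}\Big) \Big\}, \quad j = 1, 2, \ldots, r-1.$$

Also,

$$N_0 = \Big\{ x_k \in N : \frac{1}{2}\Big(1 + \frac{r-1}{r}\Big) < w_k \le 1 \Big\}$$

and

$$N_r = \Big\{ x_k \in N : \frac{1}{2}\Big(1 - \frac{1}{r}\Big) < w_k \le \frac{1}{2} \Big\}.$$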



The number of customers in each $N_j$ (respectively, $N^j$) is denoted by $n_j$
(respectively, $n^j$) for all possible values of $j$.

Note that for any $j = 1, 2, \ldots, r-1$, one vehicle can deliver the demand of a
customer from $N_j$ together with the demand of exactly one customer from $N^j$.
The SRP($h, r$) heuristic generates pairs of customers, one customer from $N_j$ and
one from $N^j$, for every $j = 1, 2, \ldots, r-1$, using the same region-partitioning scheme
used in the proof of Theorem 17.4.1 (Sect. 17.4.2). The customers in $N_0 \cup N_r$ are
served separately; a single vehicle is assigned to each of these customers.

For every subregion $A_i$, $i = 1, 2, \ldots, t(h)$, generated by the grid $G(h)$ (see
Sect. 17.4.2) and for every $j = 1, 2, \ldots, r-1$, let $N_j(i)$ [respectively, $N^j(i)$] be
the subset of points in $N_j$ (respectively, $N^j$) that fall in subregion $A_i$. Also, let
$n_j(i) = |N_j(i)|$ and $n^j(i) = |N^j(i)|$.

In each subregion $A_i$, $i = 1, 2, \ldots, t(h)$, and for any $j = 1, 2, \ldots, r-1$, we
arbitrarily match one customer from $N_j(i)$ with exactly one customer from $N^j(i)$; one
vehicle serves each such pair. If $n_j(i) = n^j(i)$, then all customers in $N_j(i) \cup N^j(i)$
are matched and therefore visited in pairs. If, however, $n_j(i) \ne n^j(i)$, then we can
match exactly $\min\{n_j(i), n^j(i)\}$ pairs of customers. The remaining $|n_j(i) - n^j(i)|$
customers in $N_j(i) \cup N^j(i)$ that have not yet been matched are each served by one
vehicle. Thus, the total number of vehicles used in subregion $A_i$ is

$$n_0(i) + n_r(i) + \sum_{j=1}^{r-1} \max\{n_j(i), n^j(i)\}.$$
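
The slicing of demands and the resulting vehicle count for one subregion can be illustrated as follows. This is a sketch under assumptions: the names `slice_of` and `vehicles_in_subregion` are hypothetical, demands are taken to lie in (0, 1], and floating-point demands that fall exactly on a slice boundary may be classified on either side.

```python
import math


def slice_of(w, r):
    """Classify a demand w in (0, 1] into one of the 2r slices of width 1/(2r).

    Returns ("low", j) for N_j, ("high", j) for N^j (j = 1, ..., r-1),
    ("single", 0) for N_0 and ("single", r) for N_r.
    """
    s = max(1, math.ceil(w * 2 * r))    # w lies in ((s-1)/(2r), s/(2r)]
    if s == 2 * r:
        return ("single", 0)            # N_0: the largest demands
    if s == r:
        return ("single", r)            # N_r: demands just below 1/2
    if s < r:
        return ("low", r - s)           # N_j with j = r - s
    return ("high", s - r)              # N^j with j = s - r


def vehicles_in_subregion(demands, r):
    """n_0(i) + n_r(i) + sum_j max{n_j(i), n^j(i)} for the demands in one subregion."""
    low = [0] * (r + 1)
    high = [0] * (r + 1)
    single = 0
    for w in demands:
        kind, j = slice_of(w, r)
        if kind == "single":
            single += 1
        elif kind == "low":
            low[j] += 1
        else:
            high[j] += 1
    return single + sum(max(low[j], high[j]) for j in range(1, r))
```

With $r = 2$ the slices are $(0, \frac14]$, $(\frac14, \frac12]$, $(\frac12, \frac34]$, and $(\frac34, 1]$, so a demand from $N_1$ (at most $\frac14$) always fits with one from $N^1$ (at most $\frac34$).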

The heuristic clearly generates a feasible solution to the UCVRP. Moreover, this
solution is feasible for $P$, as each vehicle visits at most two customers. Thus,

$$Z^{OMP} \le Z^{SRP(h,r)} \quad \text{for any } r \ge 1 \text{ and } h > 0.$$

We now proceed by finding an upper bound on $Z^{SRP(h,r)}$. Essentially the same
analysis as in Sect. 17.4.2 shows that the total distance traveled by all vehicles is
no more than

$$2 \sum_{i=1}^{t(h)} d(i) \Big[ n_0(i) + n_r(i) + \sum_{j=1}^{r-1} \max\{n_j(i), n^j(i)\} \Big] + 2nh.$$



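Since

$$\lim_{n(i)\to\infty} \frac{n_j(i)}{n(i)} = \lim_{n(i)\to\infty} \frac{n^j(i)}{n(i)} = \frac{1}{2r} \quad (a.s.) \quad \text{for all } j = 1, 2, \ldots, r,$$

we have

$$\lim_{n(i)\to\infty} \frac{1}{n(i)} \Big[ n_0(i) + n_r(i) + \sum_{j=1}^{r-1} \max\{n_j(i), n^j(i)\} \Big] = \frac{1}{2} + \frac{1}{2r} \quad (a.s.).$$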



The remainder of the proof is identical to the proof of the upper bound of 
Theorem 17.4.1. 

Therefore, the OMP is asymptotically optimal when demands are uniformly
distributed between 0 and 1. In fact, the proof can be extended to a larger class of
demand distributions. For example, for any demand distribution with symmetric
density, that is, one with $f(x) = f(1 - x)$ for $x \in [0, 1]$, one can show that the same
result holds.

17.7 The Location-Based Heuristic

Bramel and Simchi-Levi (1995) used the insight obtained from the analysis of the 
asymptotic optimal solution value (see Theorem 17.4.1 above and the discussion 
that follows it) to develop a new and effective class of heuristics for the UCVRP 
called location-based heuristics. Specifically, this class of heuristics was motivated 
by the following observations. 

A byproduct of the proof of Theorem 17.4.1 is that the region-partitioning
scheme used to find an upper bound on $Z^*$ is asymptotically optimal. Unfortunately,
the scheme is not polynomial since it requires, among other things,
optimally solving the bin-packing problem. But the scheme suggests that,
asymptotically, the tours in an optimal solution will be of a very simple structure
consisting of two parts. The first is the round trip the vehicle makes from the depot
to the subregion (where the customers are located); we call these the simple tours.
The second is the additional distance (we call this the insertion cost) accrued by
visiting each of the customers it serves in the subregion. Our goal is therefore to
construct a heuristic that assigns customers to vehicles so as to minimize the sum
of the length of all simple tours plus the total insertion costs of customers into each
simple tour. If done carefully, the solution obtained is asymptotically optimal.

To construct such a heuristic, we formulate the routing problem as another
combinatorial problem commonly called [see, e.g., Pirkul (1987)] the single-source
capacitated facility location problem (CFLP). This problem can be described as
follows: Given $m$ possible sites for facilities of fixed capacity $Q$, we would like to
locate facilities at a subset of these $m$ sites and assign $n$ retailers, where retailer $i$
demands $w_i$ units of a facility’s capacity, in such a way that each retailer is assigned
to exactly one facility, the facility capacities are not exceeded, and the total cost
is minimized. A site-dependent cost is incurred for locating each facility; that is,
if a facility is located at site $j$, the setup cost is $v_j$, for $j = 1, 2, \ldots, m$. The cost of
assigning retailer $i$ to facility $j$ is $c_{ij}$ (the assignment cost), for $i = 1, 2, \ldots, n$ and
$j = 1, 2, \ldots, m$.
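The single-source CFLP can be formulated as the following integer linear
program. Let

$$
y_j =
\begin{cases}
1, & \text{if a facility is located at site } j, \\
0, & \text{otherwise,}
\end{cases}
$$

and let

$$
x_{ij} =
\begin{cases}
1, & \text{if retailer } i \text{ is assigned to a facility at site } j, \\
0, & \text{otherwise.}
\end{cases}
$$

$$
\begin{aligned}
\text{Problem CFLP}: \quad \min\ & \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} x_{ij} + \sum_{j=1}^{m} v_j y_j \\
\text{s.t.}\ & \sum_{j=1}^{m} x_{ij} = 1, && \forall i, && (17.8) \\
& \sum_{i=1}^{n} w_i x_{ij} \le Q, && \forall j, && (17.9) \\
& x_{ij} \le y_j, && \forall i, j, && (17.10) \\
& x_{ij} \in \{0, 1\}, && \forall i, j, && (17.11) \\
& y_j \in \{0, 1\}, && \forall j. && (17.12)
\end{aligned}
$$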



Constraints (17.8) ensure that each retailer is assigned to exactly one facility, 
and constraints (17.9) ensure that the facility’s capacity constraint is not violated. 
Constraints (17.10) guarantee that if a retailer is assigned to site j, then a facility 
is located at that site. Constraints (17.11) and (17.12) ensure the integrality of the 
variables. 

In formulating the UCVRP as an instance of the CFLP, we set every customer
$x_j$ in the UCVRP as a possible facility site in the location problem. The length
of the simple tour that starts at the depot, visits customer $x_j$, and then goes back
to the depot is the setup cost in the location problem (i.e., $v_j = 2d_j$). Finally, the
cost of inserting a customer into a simple tour in the UCVRP is the assignment
cost in the location problem (i.e., $c_{ij} = d_i + d_{ij} - d_j$). This cost should represent
the added cost of inserting customer $i$ into a simple tour through the depot and
customer $j$. Consequently, when $i$ is added to a tour with $j$, the added cost is
$c_{ij} = d_i + d_{ij} - d_j$, so that $v_j + c_{ij} = d_i + d_{ij} + d_j$. However, when a third customer
is added, the calculation is not so simple, and therefore, the values of $c_{ij}$ should, in
fact, represent an approximation to the cost of adding $i$ to a tour that goes through
customer $j$ and the depot. Hence, a solution for the CVRP is obtained by
solving the CFLP with the data as described above. The solution obtained from
the CFLP is transformed (in an obvious way) to a solution to the CVRP.

Although NP-hard, the CFLP can be solved efficiently, but approximately, by
the familiar Lagrangian relaxation technique (see Chap. 15), as described in Pirkul
(1987) or Bramel and Simchi-Levi (1995), or by a cutting-plane algorithm, as described
in Deng and Simchi-Levi (1992).

We can now describe the location-based heuristic (LBH): 

The Location-Based Heuristic 

Step 1: Formulate the UCVRP as an instance of the CFLP. 

Step 2: Solve the CFLP. 

Step 3: Transform the solution obtained in step 2 into a solution for the UCVRP. 
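
The three steps above can be sketched for a toy instance as follows. This is a minimal illustration, not the method used in the cited papers: Step 2 is carried out by brute-force enumeration instead of Lagrangian relaxation or cutting planes, so it is usable only for a handful of customers. The function names `lbh_cflp_data` and `solve_cflp_bruteforce`, and the depot at the origin, are assumptions made for the example.

```python
import math
from itertools import product


def lbh_cflp_data(points, depot=(0.0, 0.0)):
    """Step 1: setup costs v_j = 2 d_j and assignment costs c_ij = d_i + d_ij - d_j."""
    n = len(points)
    d = [math.dist(depot, p) for p in points]
    v = [2 * dj for dj in d]
    c = [[d[i] + math.dist(points[i], points[j]) - d[j] for j in range(n)]
         for i in range(n)]
    return v, c


def solve_cflp_bruteforce(v, c, w, Q=1.0):
    """Step 2 (toy version): enumerate every assignment of customers to seed sites."""
    n = len(w)
    best_cost, best_assign = math.inf, None
    for assign in product(range(n), repeat=n):    # assign[i] = site serving customer i
        load = [0.0] * n
        for i, j in enumerate(assign):
            load[j] += w[i]
        if any(l > Q + 1e-12 for l in load):      # capacity constraint (17.9)
            continue
        cost = sum(c[i][j] for i, j in enumerate(assign))
        cost += sum(v[j] for j in set(assign))    # pay the setup cost of each open site
        if cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign
```

Step 3 then reads the routes off `best_assign`: customers sharing a site index are served by the same vehicle. For two customers on one route the CFLP cost equals the true tour length, since $v_j + c_{ij} = d_i + d_{ij} + d_j$; for longer routes it is only an approximation, as noted in the text.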

Variations of the LBH can also be applied to other problems; we discuss this 
and related issues in the next chapter, where we consider a more general vehicle 
routing problem. 

The LBH algorithm was tested on a set of 11 standard test problems taken from
the literature. The problems are in the Euclidean plane, and they vary in size from
15 to 199 customers. The performance of the algorithm on these test problems
was found to be comparable to the performance of most published heuristics. This
includes both the running time of the algorithm as well as the quality (value) of
the solutions found; see Bramel and Simchi-Levi (1995) for a detailed discussion.

One way to explain the excellent performance of the LBH is by analyzing its 
average performance. Indeed, a proof similar to the proof of Theorem 17.4.1 reveals 
[see also Bramel and Simchi-Levi (1995)] that 

Theorem 17.7.1 Under the assumptions of Theorem 17.4.1, there are versions
of the LBH that are asymptotically optimal; that is,

$$\lim_{n\to\infty} \frac{1}{n} Z^{LBH} = 2\gamma E(d) \quad (a.s.).$$

Finally, we observe that the generalized assignment heuristic due to Fisher and 
Jaikumar (1981) can be viewed as a special case of the LBH in which the seed 
customers are selected by a dispatcher. In the second step, customers are assigned 
to the seeds in an efficient way by solving a generalized assignment problem. The 
advantage of the LBH is that the selection of the seeds and the assignment of 
customers to seeds are done simultaneously, and not sequentially as in the gener- 
alized assignment heuristic. Note that neither of these heuristics (the LBH or the 
generalized assignment heuristic) requires that potential seed points be customer 
locations; both can be easily implemented to start with seed points that are sim- 
ply points on the plane. A byproduct of the analysis, therefore, is that when the 
generalized assignment heuristic is carefully implemented (i.e., “good” seeds are 
selected), it is asymptotically optimal as well. 



17.8 Rate of Convergence to the Asymptotic Value 

While the results in the two previous sections completely characterize the asymp- 
totic optimal solution value of the UCVRP, they do not say anything about the 
rate of convergence to the asymptotic solution value. See Psaraftis (1984) for an 
informal discussion of this issue. 

To get some intuition on the rate of convergence, it is interesting to determine the
expected difference between the optimal solution for a given number of customers
$n$, and the asymptotic solution value (i.e., $2\gamma E[d]$). This can be done for the
uniform model discussed in Sect. 17.6.

In this case, Bramel et al. (1991) and, independently, Rhee (1991) proved the 
following strong result. 

Theorem 17.8.1 Let $x_k$, $k = 1, 2, \ldots, n$, be a sequence of independent random
variables uniformly distributed in the unit square $[0, 1]^2$. Let the demands $w_k$, $k =
1, 2, \ldots, n$, be drawn independently from a uniform distribution on (0, 1]. Then

$$E[Z_n^*] = nE[d] + \Theta(n^{2/3}).$$

The proof of Theorem 17.8.1 relies heavily on the theory of three-dimensional 
stochastic matching, which is outside the scope of our survey. We refer the reader to 
Coffman and Lueker (1991, Chap. 3) for an excellent review of matching problems.

Rhee has also found an upper bound on the rate of convergence to the asymptotic
solution value, for general distribution of the customers’ locations and their
demands. Using a new matching theorem developed together with Talagrand, she
proved

Theorem 17.8.2 Under the assumptions of Theorem 17.4.1, we have

$$2n\gamma E[d] \le E[Z_n^*] \le 2n\gamma E[d] + O((n \log n)^{2/3}).$$



17.9 Exercises 



Exercise 17.1. Consider the following heuristic for the CVRP with unequal 
demands. All customers of demand W{ > \ are served individually, one customer 
per vehicle. To serve the rest, apply the UITP heuristic with vehicle capacity Q. 
Prove that this solution can be transformed into a feasible solution to the CVRP 
with unequal demands. What is the worst-case bound of this heuristic? 

Exercise 17.2. Prove Corollary 17.5.3. 

Exercise 17.3. Given a seed point $i$, assume you must estimate the cost of the
optimal traveling salesman tour through a set of points $S \cup \{i\}$ using the following
cost approximation. Starting with $2d_i$, when each point $j$ is added to the tour, add
the cost $c_{ij} = d_j + d_{ij} - d_i$. Show that for any $r > 1$, there is an example
where the approximation is $r$ times the optimal cost.

Exercise 17.4. Construct an example of the single-source CFLP where each 
facility is a potential site (and vice versa) in which an optimal solution chooses a 
facility, but the demand of that facility is assigned to another chosen site. 

Exercise 17.5. Show that Lemma 17.4.5 can be replaced by an equality instead 
of an inequality. 

Exercise 17.6. Prove that the version of the LBH with setup costs $v_j = 2d_j$ and
assignment costs $c_{ij} = d_i + d_{ij} - d_j$ is asymptotically optimal.

Exercise 17.7. Explain why the following constraints can or cannot be inte- 
grated into the savings algorithm. 

(a) Distance constraint. Each route must be at most A miles long. 

(b) Minimum route size. Each route must pick up at least m points.

(c) Mixing constraints. Even indexed points cannot be on the same route as odd
indexed points.

Exercise 17.8. Consider an instance of the CVRP with $n$ customers. A customer
is red with probability $p$ and blue with probability $1 - p$, for some $p \in [0, 1]$. Red
customers have loads of size |, while blue customers have loads of size What is 
lim n ^ 00 ^ as a function of pi 
The VRP with Time-Window Constraints



18.1 Introduction 

In many distribution systems, in addition to the load that has to be delivered to it,
each customer specifies a period of time, called a time window, in which this
delivery must occur. The objective is to find a set of routes for the vehicles, where each
route begins and ends at the depot and serves a subset of the customers without
violating the vehicle capacity and time-window constraints, while minimizing the
total length of the routes. We call this model the vehicle routing problem with
time windows (VRPTW).

Due to the wide applicability and the economic importance of the problem,
variants of it have been extensively studied in the vehicle routing literature; for a
review, see Solomon and Desrosiers (1988). Most of the work on the problem has
focused on an empirical analysis, while very few papers have studied the problem
from an analytical point of view, in an attempt to characterize the
theoretical behavior of heuristics and to use the insights obtained to construct
effective algorithms. Some exceptions are the recent works of Federgruen and van
Ryzin (1997) and Bramel and Simchi-Levi (1996). Here we describe the results of
the latter paper.

18.2 The Model 

To formally describe the model we analyze here, let the index set of the $n$ customers
be denoted $N = \{1, 2, \ldots, n\}$. Let $x_k \in \mathbb{R}^2$ be the location of customer $k \in N$.
Assume, without loss of generality, that the depot is at the origin and, by rescaling,
that the vehicle capacity is 1 and that the length of the working day is 1. We assume
vehicles can leave and return to the depot at any time. Associated with customer
$k$ is a quadruplet $(w_k, e_k, s_k, l_k)$, called the customer parameters, which represents,
respectively, the load that must be picked up, the earliest starting time for service,
the time required to complete the service, called the service time, and the latest
time service can end. Clearly, feasibility requires that $e_k + s_k \le l_k$ and $w_k, e_k, l_k \in
[0, 1]$, for each $k \in N$.

For any point $x \in \mathbb{R}^2$, let $\|x\|$ denote the Euclidean distance between $x$ and the
depot. Let $d_k = \|x_k\|$ be the distance between customer $k$ and the depot. Also, let
$d_{jk} = \|x_j - x_k\|$ be the distance between customer $j$ and customer $k$. Let $Z^*$ be
the total distance traveled in an optimal solution to the VRPTW, and let $Z^H$ be
the total distance traveled in the solution provided by a heuristic H.

Consider the customer locations to be distributed according to a distribution $\mu$
with compact support in $\mathbb{R}^2$. Let the customer parameters $\{(w_k, e_k, s_k, l_k) : k \in
N\}$ be drawn from a joint distribution $\Phi$ with a continuous density $\phi$. Let $C$ be the
support of $\phi$; that is, $C$ is a subset of $\{(a_1, a_2, a_3, a_4) \in [0, 1]^4 : a_2 + a_3 \le a_4\}$. Each
customer is therefore represented by its location in the Euclidean plane along with
a point in $C$. Finally, we assume that a customer’s location and its parameters are
independent of each other.

In our analysis we associate a job with each customer. The parameters of job
$k$ are the parameters of customer $k$, that is, $(w_k, e_k, s_k, l_k)$, where $w_k$ is referred
to as the load of job $k$ and, using standard scheduling terminology, $e_k$ represents
the earliest time job $k$ can begin processing, $s_k$ represents the processing time,
and $l_k$ denotes the latest time the processing of the job can end. The value of $e_k$
can be thought of as the release time of job $k$, that is, the time it is available for
processing. The value of $l_k$ represents the due date for the job. Each job can be
viewed abstractly as simply a point in $C$. Occasionally, we will refer to customers
and jobs interchangeably; this convenience should cause no confusion.

To any set of customers $T \subseteq N$ with parameters $\{(w_k, e_k, s_k, l_k) : k \in T\}$, we
associate a corresponding machine scheduling problem as follows. Consider the set
of jobs $T$ and an infinite sequence of parallel machines. Job $k$ becomes available for
processing at time $e_k$ and must be finished processing by time $l_k$. The objective
in this scheduling problem is to assign each job to a machine such that (i) each
machine has at most one job being processed on it at a given time, (ii) the
processing time of each job starts no earlier than its release time and ends no later
than its due date, and (iii) the total load of all jobs assigned to a machine is no
more than 1, and the number of machines used is minimized. In our discussion
we refer to (ii) as the job time-window constraint and to (iii) as the machine load
constraint.
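
Conditions (i)-(iii) are easy to check for a single machine once the processing order is fixed. The sketch below is an illustration under assumptions: the function name `machine_is_feasible` and the dictionary layout of the jobs are hypothetical, and each job is started as early as possible, which loses nothing for a fixed order because delaying a start can only delay later jobs.

```python
def machine_is_feasible(jobs, order, tol=1e-12):
    """Check one machine's job sequence against the time-window and load constraints.

    jobs  -- dict mapping a job id k to its parameters (w_k, e_k, s_k, l_k)
    order -- job ids in the order they are processed on this machine
    """
    clock = 0.0
    load = 0.0
    for k in order:
        w, e, s, l = jobs[k]
        start = max(clock, e)       # (i) one job at a time, (ii) not before release
        finish = start + s
        if finish > l + tol:        # (ii) must end by the due date
            return False
        clock = finish
        load += w
    return load <= 1 + tol          # (iii) machine load constraint
```

Minimizing the number of machines additionally requires choosing which jobs share a machine and in what order, which is the hard combinatorial part of the problem; the check above only validates one candidate machine.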

Scheduling problems have been widely studied in the operations research liter- 
ature; see Lawler et al. (1993) and Pinedo (1995). Unfortunately, no paper has 
considered the scheduling problem in its general form with the objective function 
of minimizing the number of machines used. 

Observe that in the absence of time window constraints, the scheduling problem
is no more than a bin-packing problem. Indeed, in that case, the VRPTW reduces
to the model analyzed in the previous chapter, the CVRP. Thus, our strategy is
to try to relate the machine scheduling problem to the VRPTW in much the same
way as we used results obtained for the bin-packing problem in the analysis of the
CVRP. As we shall shortly see, this is much more complex.

Let $M^*(S)$ be the minimum number of machines needed to schedule a set $S$ of
jobs. It is clear that this machine scheduling problem possesses the subadditivity
property, described in Sect. 5.2. This implies that if $M_n^*$ is the minimum number
of machines needed to schedule a set of $n$ jobs whose parameters are drawn
independently from a distribution $\Phi$, then there exists a constant $\gamma > 0$ (depending
only on $\Phi$) such that $\lim_{n\to\infty} M_n^*/n = \gamma$ (a.s.).

In this chapter, we relate the solution to the VRPTW to the solution to the 
scheduling problem defined by the customer parameters. That is, we show that 
asymptotically the VRPTW is no more difficult to solve than the corresponding 
scheduling problem. Our main result is the following. 

Theorem 18.2.1 Let $x_1, x_2, \ldots, x_n$ be independently and identically distributed
according to a distribution $\mu$ with compact support in $\mathbb{R}^2$, and define

$$E(d) = \int_{\mathbb{R}^2} \|x\| \, d\mu(x).$$

Let the customer parameters $\{(w_k, e_k, s_k, l_k) : k \in N\}$ be drawn independently
from $\phi$. Let $M_n^*$ be the minimum number of machines needed to feasibly schedule
the $n$ jobs corresponding to these parameters, and $\lim_{n\to\infty} M_n^*/n = \gamma$ (a.s.). Then

$$\lim_{n\to\infty} \frac{1}{n} Z^* = 2\gamma E(d) \quad (a.s.).$$

We prove this theorem (in Sect. 18.3) by introducing a lower bound on the
optimal solution value and then developing an upper bound that converges to
the same value. The lower bound uses a similar technique to the one developed
in Chap. 17. The upper bound can be viewed as a randomized algorithm that
is guaranteed to generate a feasible solution to the problem. That is, different
runs of the algorithm on the same data may generate different feasible solutions.
In Sect. 18.4, we show that the analysis leads, in a natural way, to the development
of a new deterministic algorithm that is asymptotically optimal for the VRPTW.
Though not polynomial, computational evidence shows that the algorithm works
very well on a set of standard test problems.

18.3 The Asymptotic Optimal Solution Value

We start the analysis by introducing a lower bound on the optimal objective
function value $Z^*$. First, let $A$ be the compact support of $\mu$, and define $d_{\max} =
\sup\{\|x\| : x \in A\}$. Pick a fixed integer $r \ge 1$, and define $d_j = (j - 1)\frac{d_{\max}}{r}$, for
$j = 1, 2, \ldots, r$. Now define the sets:

$$F_j = \{k \in N : d_j \le d_k\}, \quad \text{for } j = 1, 2, \ldots, r.$$

For any set $T \subseteq N$, let $M^*(T)$ be the minimum number of machines needed to
feasibly schedule the set of jobs $\{(w_k, e_k, s_k, l_k) : k \in T\}$. The next lemma provides
a deterministic lower bound on $Z^*$ and is analogous to Lemma 17.4.3 developed
for the VRP with capacity constraints.

Lemma 18.3.1

$$Z^* \ge 2 \frac{d_{\max}}{r} \sum_{j=2}^{r} M^*(F_j).$$

Proof. Let $V_j^*$ be the number of vehicles in an optimal solution to the VRPTW
that serve a customer from $F_j$, for $j = 1, 2, \ldots, r$. By this definition, $V_r^*$ is exactly
the number of vehicles whose farthest customer visited is in $F_r$, and $V_j^* - V_{j+1}^*$
is exactly the number of vehicles whose farthest customer visited is in $F_j \setminus F_{j+1}$.
Observe that if $V_j^* = V_{j+1}^*$, then there are no vehicles whose farthest customer
visited is in $F_j \setminus F_{j+1}$. Consequently,

$$
\begin{aligned}
Z^* &\ge 2 d_r V_r^* + \sum_{j=1}^{r-1} 2 d_j (V_j^* - V_{j+1}^*) \\
&= 2 d_1 V_1^* + \sum_{j=2}^{r} 2 (d_j - d_{j-1}) V_j^* \\
&= 2 \frac{d_{\max}}{r} \sum_{j=2}^{r} V_j^*.
\end{aligned}
$$

We now claim that for each $j = 1, 2, \ldots, r$, $V_j^* \ge M^*(F_j)$. This should be clear
from the fact that the set of jobs in $F_j$ can be feasibly scheduled on $V_j^*$ machines
by scheduling the jobs at the times they are served in the VRPTW solution. ∎

We can now determine the asymptotic value of this lower bound. This can be
done in a similar manner to that of Chap. 17, and hence we omit the proof here.

Lemma 18.3.2 Under the conditions of Theorem 18.2.1,

$$\lim_{n\to\infty} \frac{1}{n} Z^* \ge 2\gamma E(d) \quad (a.s.).$$

We prove Theorem 18.2.1 by approximating the optimal cost from above by that 
of the following four-step heuristic. In the first step, we partition the region where 
the customers are distributed into subregions. In the second step, we randomly 
separate the customers of each subregion into two sets. Then for each subregion, 
we solve a machine scheduling problem defined on the customers in one of these 
sets. Finally, we use this schedule to specify how to serve all the customers in the 
subregion. 

Pick an $\epsilon > 0$, and let $\delta$ be given by the definition of continuity of $\phi$; that is,
$\delta > 0$ is such that for all $x, y \in C$ with $\|x - y\| < \delta$, we have $|\phi(x) - \phi(y)| < \epsilon$.
Finally, pick a $\Delta < \min\{\delta/\sqrt{2}, \epsilon\}$.
Let $G(\Delta)$ be an infinite grid of squares of diagonal $\Delta$, that is, of side $\Delta/\sqrt{2}$, with
edges parallel to the system coordinates. Recall that $A$ is the compact support of
$\mu$, and let $A_1, A_2, \ldots, A_{t(\Delta)}$ be the subregions of $G(\Delta)$ that intersect $A$ and have
$\mu(A_i) > 0$.

Let $N(i)$ be the indices of the customers located in subregion $A_i$, and define
$n(i) = |N(i)|$. For each customer $k \in N(i)$, with parameters $(w_k, e_k, s_k, l_k)$, we
associate a job with parameters $(w_k, e_k, s_k + \Delta, l_k + \Delta)$. For any set $T \subseteq N$ of
customers, let $M_{\Delta}^*(T)$ be the minimum number of machines needed to feasibly
schedule the set of jobs $\{(w_k, e_k, s_k + \Delta, l_k + \Delta) : k \in T\}$. In addition, for any set
$T$ of customers, let $T(i) = N(i) \cap T$, for $i = 1, 2, \ldots, t(\Delta)$.

For the given grid partition and for any set $T \subseteq N$ of customers, the following
is a feasible way to serve the customers in $N$. All subregions are served separately;
that is, no customers from different subregions are served by the same vehicle.
In subregion $A_i$, we solve the machine scheduling problem defined by the jobs
$\{(w_k, e_k, s_k + \Delta, l_k + \Delta) : k \in T(i)\}$. Then, for each machine in this scheduling
solution, we associate a vehicle that serves the customers corresponding to the jobs
on that machine. The customers are visited in the exact order they are processed
on the machine, and they are served in exactly the same interval of time as they
are processed. This is repeated for each machine of the scheduling solution. The
customers of the set $N(i) \setminus T(i)$ are served one vehicle per customer. This strategy
is repeated for every subregion, thus providing a solution to the VRPTW.
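
The first step of this strategy, grouping customers by grid square, can be sketched as below. The function name `grid_cells` and the choice to anchor the grid at the origin are assumptions made for the example; any translation of the grid would serve equally well.

```python
import math
from collections import defaultdict


def grid_cells(points, delta):
    """Group customer indices by the grid square (of diagonal delta) containing them.

    Squares have side delta / sqrt(2), so any two customers in the same
    square are at most delta apart.
    """
    side = delta / math.sqrt(2)
    cells = defaultdict(list)
    for k, (x, y) in enumerate(points):
        cells[(math.floor(x / side), math.floor(y / side))].append(k)
    return dict(cells)
```

Each returned list plays the role of $N(i)$ for one subregion $A_i$. Because two customers in the same square are at most $\Delta$ apart, padding every service time by $\Delta$ leaves enough slack to travel between consecutive customers on a route.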

We will show that for a suitable choice of the set T, this routing strategy is 
asymptotically optimal for the VRPTW. An interesting fact about the set T is 
that it is a randomly generated set; that is, each time the algorithm is run, it 
results in different sets T. 

The first step is to show that, for any set $T \subseteq N$ (possibly empty), the
solution produced by the above-mentioned strategy provides a feasible solution to the
VRPTW. This should be clear from the fact that having an extra $\Delta$ units of time
to travel between customers in a subregion is enough since all subregions have
diagonal $\Delta$. Therefore, any sets of customers scheduled on a machine together can
be served together by one vehicle. Customers of $N(i) \setminus T(i)$ can clearly be served
within their time windows since they are served individually, one per vehicle.

We now proceed to find an upper bound on the value of this solution. For each
subregion $A_i$, let $n_j(i)$ be the number of jobs on the $j$th machine in the optimal
schedule of the jobs in $T(i)$, for each $j = 1, 2, \ldots, M^*_\Delta(T(i))$. Let $d(i)$ be the
distance from the depot to the closest customer in $N(i)$, that is, in subregion $A_i$.
Using the routing strategy described above, the distance traveled by the vehicle
serving the customers whose jobs were assigned to the $j$th machine of subregion
$A_i$ is no more than

$$2d(i) + \Delta\,(n_j(i) + 1).$$
Therefore,

$$
\begin{aligned}
Z_t^* &\le \sum_{i=1}^{t(\Delta)} \sum_{j=1}^{M^*_\Delta(T(i))} \bigl(2d(i) + \Delta\,(n_j(i) + 1)\bigr) + \sum_{k \notin T} 2d_k \\
&\le 2\sum_{i=1}^{t(\Delta)} M^*_\Delta(T(i))\,d(i) + 2n\Delta + \sum_{k \notin T} 2d_k .
\end{aligned}
$$

Dividing by $n$ and taking the limit, we have

$$
\begin{aligned}
\lim_{n\to\infty} \frac{1}{n} Z_t^*
&\le 2\sum_{i=1}^{t(\Delta)} \lim_{n\to\infty} \frac{1}{n} M^*_\Delta(T(i))\,d(i) + 2\Delta + \lim_{n\to\infty} \frac{1}{n} \sum_{k \notin T} 2d_k \\
&= 2\sum_{i=1}^{t(\Delta)} \lim_{n\to\infty} \frac{n(i)}{n}\,\frac{M^*_\Delta(T(i))}{n(i)}\,d(i) + 2\Delta + \lim_{n\to\infty} \frac{1}{n} \sum_{k \notin T} 2d_k \\
&= 2\sum_{i=1}^{t(\Delta)} \Bigl(\lim_{n\to\infty} \frac{M^*_\Delta(T(i))}{n(i)}\Bigr) \Bigl(\lim_{n\to\infty} \frac{n(i)}{n}\,d(i)\Bigr) + 2\Delta + \lim_{n\to\infty} \frac{1}{n} \sum_{k \notin T} 2d_k . \qquad (18.1)
\end{aligned}
$$

In order to relate this quantity to the lower bound of Lemma 18.3.2, we must
choose the set $T$ appropriately. For this purpose, we make the following
observation. Recall that $\phi$ is the continuous density associated with the
distribution $\Phi$. The customer parameters $(w_k, e_k, s_k, l_k)$ of each of the customers of $N$
are drawn randomly from the density $\phi$. Associated with each customer is a job
whose parameters are perturbed by $\Delta$ in the third and fourth coordinates, that is,
$(w_k, e_k, s_k + \Delta, l_k + \Delta)$. This is equivalent to randomly drawing the job parameters
from a density that we call $\phi'$. The density $\phi'$ can be found simply by translating $\phi$
by $\Delta$ in the third and fourth coordinates, that is, for each $x = (\theta_1, \theta_2, \theta_3, \theta_4) \in \mathbb{R}^4$,
$\phi'(x) = \phi'(\theta_1, \theta_2, \theta_3, \theta_4) = \phi(\theta_1, \theta_2, \theta_3 - \Delta, \theta_4 - \Delta)$. Finally, for each $x \in \mathbb{R}^4$, define
$\bar\phi(x) = \min\{\phi(x), \phi'(x)\}$ and let $q = \int_{\mathbb{R}^4} \bar\phi \le 1$.

The $n$ jobs (or customer parameters) $\{y_k = (w_k, e_k, s_k + \Delta, l_k + \Delta) : k \in N\}$
are drawn randomly from the density $\phi'$, and our task is to select the set $T \subseteq N$.
To simplify presentation, we refer interchangeably to the index set of jobs and to
the set of jobs itself; that is, $k \in N$ will have the same interpretation as $y_k \in N$,
where $y_k = (w_k, e_k, s_k + \Delta, l_k + \Delta)$.

For each job $y_k$, generate a random value, call it $u_k$, uniformly in $[0, \phi'(y_k)]$. The
point $(y_k, u_k) \in \mathbb{R}^5$ is a point below the graph of $\phi'$; that is, $u_k \le \phi'(y_k)$. Define
$T$ as the set of indices of jobs whose $u_k$ value falls below the graph of $\phi$; that is,
$T = \{k \in N : u_k < \phi(y_k)\}$. Then the set of jobs $\{y_k : k \in T\}$ can be viewed as a
random sample of $|T|$ jobs drawn randomly from the density

$$\frac{\bar\phi}{q}.$$
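The selection of $T$ described above is a rejection-sampling step, and a small sketch may make it concrete. The function name `select_T`, the callables `phi` and `phi_shifted`, and the one-dimensional toy density are assumptions made for illustration; in the text the jobs are four-dimensional and $\phi$ is a general continuous density.

```python
import random

def select_T(jobs, phi, phi_shifted, rng):
    """Return the indices k with u_k < phi(y_k), where u_k ~ Uniform[0, phi'(y_k)]."""
    T = []
    for k, y in enumerate(jobs):
        u = rng.uniform(0.0, phi_shifted(y))  # point (y_k, u_k) under the graph of phi'
        if u < phi(y):                        # keep it if it also lies under phi
            T.append(k)
    return T

# Toy one-dimensional stand-in (an assumption, not the book's setting):
# phi is uniform on [0, 1] and phi' is phi translated by delta.
delta = 0.1
phi = lambda y: 1.0 if 0.0 <= y <= 1.0 else 0.0
phi_shifted = lambda y: phi(y - delta)

rng = random.Random(0)
jobs = [rng.random() + delta for _ in range(10_000)]  # jobs drawn from phi'
T = select_T(jobs, phi, phi_shifted, rng)
rejected_fraction = 1 - len(T) / len(jobs)
```

In this toy case a job is rejected exactly when it falls where $\phi' > \phi$, so the rejected fraction is close to `delta`; this is the quantity that Lemma 18.3.4 bounds in the general setting.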
In order to relate this upper bound to the lower bound, we need to present the
following lemma.

Lemma 18.3.3 For $T$ generated as above and for each subregion $A_i$,
$i = 1, 2, \ldots, t(\Delta)$,

$$\lim_{n\to\infty} \frac{1}{n(i)} M^*_\Delta(T(i)) \le \gamma \quad \text{(a.s.)}.$$

Proof. To prove the result for a given subregion $A_i$, we construct a feasible schedule
for the set of jobs $\{y_k = (w_k, e_k, s_k + \Delta, l_k + \Delta) : k \in T(i)\}$. Generate $n(i) - |T(i)|$
jobs randomly from the density

$$\frac{\phi - \min\{\phi, \phi'\}}{1 - q}.$$

Call this set of jobs $D$, for dummy jobs. From the construction of the sets $D$ and
$T(i)$, it is a simple exercise to show that the parameters of the jobs in $D \cup T(i)$
are distributed like $\phi$.

A feasible schedule of the jobs in $T(i)$ is obtained by optimally scheduling the
jobs in $D \cup T(i)$ using, say, $M_i$ machines. The number of machines needed to
schedule the jobs in $T(i)$ is obviously no more than $M_i$, since the jobs in $D$ can
simply be ignored. Thus, we have the bound

$$M^*_\Delta(T(i)) \le M_i.$$

Now dividing by $n(i)$ and taking the limits, we get

$$\lim_{n\to\infty} \frac{M^*_\Delta(T(i))}{n(i)} \le \lim_{n\to\infty} \frac{M_i}{n(i)} = \gamma \quad \text{(a.s.)},$$

since the set of jobs $D \cup T(i)$ is just a set of $n(i)$ jobs whose parameters are drawn
independently from the density $\phi$. ∎

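Lemma 18.3.3 thus reduces (18.1) to

$$
\begin{aligned}
\lim_{n\to\infty} \frac{1}{n} Z_t^*
&\le 2\sum_{i=1}^{t(\Delta)} \gamma \lim_{n\to\infty} \frac{n(i)}{n}\,d(i) + 2\Delta + \lim_{n\to\infty} \frac{1}{n} \sum_{k \notin T} 2d_k \\
&= 2\gamma \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{t(\Delta)} n(i)\,d(i) + 2\Delta + \lim_{n\to\infty} \frac{1}{n} \sum_{k \notin T} 2d_k \\
&\le 2\gamma \lim_{n\to\infty} \frac{1}{n} \sum_{k \in N} d_k + 2\Delta + \lim_{n\to\infty} \frac{1}{n} \sum_{k \notin T} 2d_k \\
&= 2\gamma E(d) + 2\Delta + \lim_{n\to\infty} \frac{1}{n} \sum_{k \notin T} 2d_k \\
&\le 2\gamma E(d) + 2\Delta + 2 d_{\max} \lim_{n\to\infty} \frac{1}{n} |N \setminus T| .
\end{aligned}
$$

The next lemma determines an upper bound on $\lim_{n\to\infty} \frac{1}{n} |N \setminus T|$.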
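Lemma 18.3.4 Given $\epsilon > 0$ and $T$ generated as above,

$$\lim_{n\to\infty} \frac{1}{n} |N \setminus T| \le (1 + \epsilon)^2 \epsilon \quad \text{(a.s.)}.$$

Proof. By the strong law of large numbers, the limit is equal to the probability
that a job of $N$ is not in the set $T$. The probability of a particular job $y_k$ not being
in $T$ is simply

$$
\begin{cases}
\dfrac{\phi'(y_k) - \phi(y_k)}{\phi'(y_k)}, & \text{if } \phi'(y_k) \ge \phi(y_k), \\[1ex]
0, & \text{otherwise.}
\end{cases}
$$

Hence, almost surely,

$$
\begin{aligned}
\lim_{n\to\infty} \frac{1}{n} |N \setminus T|
&= \int_{\mathbb{R}^4} \max\Bigl\{ \frac{\phi'(x) - \phi(x)}{\phi'(x)},\, 0 \Bigr\}\, \phi'(x)\,dx \\
&\le \int_{\mathbb{R}^4} \frac{|\phi'(x) - \phi(x)|}{\phi'(x)}\, \phi'(x)\,dx \\
&= \int_{\mathbb{R}^4} |\phi'(x) - \phi(x)|\,dx \\
&= \int_{\mathbb{R}^4} |\phi'(\theta_1, \theta_2, \theta_3, \theta_4) - \phi(\theta_1, \theta_2, \theta_3, \theta_4)|\, d(\theta_1, \theta_2, \theta_3, \theta_4) \\
&= \int_{\mathbb{R}^4} |\phi(\theta_1, \theta_2, \theta_3 - \Delta, \theta_4 - \Delta) - \phi(\theta_1, \theta_2, \theta_3, \theta_4)|\, d(\theta_1, \theta_2, \theta_3, \theta_4) \\
&\le (1 + \Delta)^2 \epsilon \\
&\le (1 + \epsilon)^2 \epsilon,
\end{aligned}
$$

where the second-to-last inequality follows from
$\|(\theta_1, \theta_2, \theta_3 - \Delta, \theta_4 - \Delta) - (\theta_1, \theta_2, \theta_3, \theta_4)\| \le \Delta\sqrt{2} < \delta$
and the continuity of $\phi$. ∎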

We now have all the necessary ingredients to finish the proof of Theorem 18.2.1;
thus,

$$\lim_{n\to\infty} \frac{1}{n} Z_t^* \le 2\gamma E(d) + 2 d_{\max} (1 + \epsilon)^2 \epsilon + 2\Delta \quad \text{(a.s.)}.$$

Since $\epsilon$ was arbitrary, and recalling that $\Delta < \epsilon$, we have

$$\lim_{n\to\infty} \frac{1}{n} Z_t^* \le 2\gamma E(d) \quad \text{(a.s.)}.$$

This upper bound combined with the lower bound proves Theorem 18.2.1.
18.4 An Asymptotically Optimal Heuristic

In this section, we generalize the LBH heuristic developed for the CVRP (see 
Chap. 17) to handle time window constraints. Similar to the original LBH, we 
prove that the generalized version is asymptotically optimal for the VRPTW. 
We refer to this more general version of the heuristic also as the location-based 
heuristic; this should cause no confusion. 

18.4.1 The Location-Based Heuristic

The LBH can be viewed as a three-step algorithm. In the first step, the parameters
of the VRPTW are transformed into data for a location problem called the
capacitated vehicle location problem with time windows (CVLPTW), described 
below. This location problem is solved in the second step. In the final step, we 
transform the solution to the CVLPTW into a feasible solution to the VRPTW. 

The Capacitated Vehicle Location Problem with Time Windows 

The capacitated vehicle location problem with time windows (CVLPTW) is 
a generalization of the single-source capacitated facility location problem (see 
Sect. 17.7) and can be described as follows: We are given m possible sites to locate 
vehicles of capacity Q. There are n customers geographically dispersed in a given 
region, where customer $i$ has $w_i$ units of product that must be picked up by a
vehicle. The pickup of customer $i$ takes $s_i$ units of time and must occur in the
time window between times $e_i$ and $l_i$; that is, the service of customer $i$ can start
at any time $t \in [e_i, l_i - s_i]$. The objective is to select a subset of the possible sites,
to locate one vehicle at each site, and to assign the customers to the vehicles.
Each vehicle must leave its site, pick up the load of customers assigned to it in
such a way that the vehicle capacity is not exceeded and all pickups occur within
the customer's time window, and then return to its site. The costs are as follows:
A site-dependent cost is incurred for locating each vehicle; that is, if a vehicle is
located at site $j$, the setup cost is $v_j$, for $j = 1, 2, \ldots, m$. The cost of assigning
customer $i$ to the vehicle at site $j$ is $c_{ij}$ (the assignment cost), for $i = 1, 2, \ldots, n$
and $j = 1, 2, \ldots, m$. We assume that there are enough vehicles and sites so that a
feasible solution exists.

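The CVLPTW can be formulated as the following mathematical program. Let

$$
y_j = \begin{cases} 1, & \text{if a vehicle is located at site } j, \\ 0, & \text{otherwise,} \end{cases}
$$

and let

$$
x_{ij} = \begin{cases} 1, & \text{if customer } i \text{ is assigned to the vehicle at site } j, \\ 0, & \text{otherwise.} \end{cases}
$$

For any set $S \subseteq N$, let $f_j(S) = 1$ if the set of customers $S$ can be feasibly served
in their time windows by one vehicle that starts and ends at site $j$ (disregarding
the capacity constraint), and 0 otherwise.

$$
\begin{array}{llll}
\text{Problem } P: & \text{Min} & \displaystyle\sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} x_{ij} + \sum_{j=1}^{m} v_j y_j & \\[2ex]
& \text{s.t.} & \displaystyle\sum_{j=1}^{m} x_{ij} = 1, \quad \forall i, & (18.2) \\[2ex]
& & \displaystyle\sum_{i=1}^{n} w_i x_{ij} \le Q, \quad \forall j, & (18.3) \\[2ex]
& & x_{ij} \le y_j, \quad \forall i, j, & (18.4) \\[1ex]
& & f_j(\{i : x_{ij} = 1\}) = 1, \quad \forall j, & (18.5) \\[1ex]
& & x_{ij}, y_j \in \{0, 1\}, \quad \forall i, j. & (18.6)
\end{array}
$$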



Constraints (18.2) ensure that each customer is assigned to exactly one vehicle, 
and constraints (18.3) ensure that the vehicle’s capacity constraint is not violated. 
Constraints (18.4) guarantee that if a customer is assigned to the vehicle at site 
j, then a vehicle is located at that site. Constraints (18.5) ensure that the time- 
window constraints are not violated. Constraints (18.6) ensure the integrality of 
the variables. 

The Heuristic 

To relate the CVLPTW to the VRPTW, consider each customer in the VRPTW 
to be a potential site for a vehicle; that is, the set of potential sites is exactly the set 
of customers, and therefore, m = n. Picking a subset of the sites in the CVLPTW 
corresponds to picking a subset of the customers in the VRPTW; we call this set 
of selected customers the seed customers. These customers are those that will form 
simple tours with the depot. 

In order for the LBH to perform well, the costs of the CVLPTW should approximate
the costs of the VRPTW. The setup cost for locating a vehicle at site
$j$ ($v_j$), or, in other words, of picking customer $j$ as a seed customer, should
be the cost of sending a vehicle from the depot to customer $j$ and back (i.e.,
the length of the simple tour). Hence, we set $v_j = 2d_j$ for each $j \in N$. The
assignment cost $c_{ij}$ is the cost of assigning customer $i$ to the vehicle at site $j$.
Therefore, this cost should represent the added cost of inserting customer $i$ into the
simple tour through the depot and customer $j$. Consequently, when $i$ is added
to a tour with $j$, the added cost is $c_{ij} = d_i + d_{ij} - d_j$, so that
$v_j + c_{ij} = d_i + d_{ij} + d_j$. This cost is exact for two and sometimes three customers.
However, as the number of customers increases, the values of $c_{ij}$ in fact represent an
approximation to the cost of adding $i$ to a tour that goes through customer $j$ and
the depot. In Sect. 18.4.3, we present values of $c_{ij}$ that we have found to work well
in practice.
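As an illustration of these cost definitions, the following sketch computes $v_j = 2d_j$ and $c_{ij} = d_i + d_{ij} - d_j$ from planar coordinates. The function name `lbh_costs` and the use of Euclidean coordinates are assumptions made for the example.

```python
import math

def lbh_costs(depot, customers):
    """Return (v, c) with v[j] = 2*d_j and c[i][j] = d_i + d_ij - d_j."""
    n = len(customers)
    d = [math.dist(depot, p) for p in customers]       # depot-to-customer distances
    v = [2.0 * d[j] for j in range(n)]                 # simple tour depot -> j -> depot
    c = [[d[i] + math.dist(customers[i], customers[j]) - d[j]
          for j in range(n)]
         for i in range(n)]                            # added cost of inserting i next to seed j
    return v, c

v, c = lbh_costs((0.0, 0.0), [(3.0, 0.0), (3.0, 4.0)])
# Tour depot -> seed 0 -> customer 1 -> depot has length 3 + 4 + 5 = 12 = v[0] + c[1][0].
```

For two customers the sum `v[j] + c[i][j]` equals the exact tour length, which matches the remark that the cost is exact in that case.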
Once these costs are determined, the second step of the LBH consists of solving
the CVLPTW. The solution provided is a set of sites (seed customers) and a set
of customers assigned to each of these sites (to each seed). This solution can then
be easily transformed into a solution to the VRPTW, since a set of customers that 
can be feasibly served starting from site j can also be feasibly served starting from 
the depot. 

18.4.2 A Solution Method for CVLPTW

The computational efficiency of the LBH depends on the efficiency with which the 
CVLPTW can be solved. We therefore present a method to solve the CVLPTW. 
As discussed earlier, the CVLPTW without constraints (18.5) is simply the
single-source capacitated facility location problem (CFLP) for which efficient
solution methods exist based on the celebrated Lagrangian relaxation technique; see
Sect. 6.3. For the CVLPTW, we use a similar method, although the specifics are
more complex in view of the existence of these time window constraints.

In this case, for a given multiplier vector $\lambda \in \mathbb{R}^n$, constraints (18.2) are relaxed
and put into the objective function with the multiplier vector. The resulting
problem can be separated into $n$ subproblems (one for each of the $n$ sites), since
constraints (18.2) are the only constraints that relate the sites to one another. The
subproblem for site $j$ is

$$
\begin{array}{lll}
\text{Problem } P_j: & \text{Min} & \displaystyle\sum_{i=1}^{n} \bar{c}_{ij} x_{ij} + v_j y_j \\[2ex]
& \text{s.t.} & \displaystyle\sum_{i=1}^{n} w_i x_{ij} \le Q, \\[2ex]
& & x_{ij} \le y_j, \quad \forall i, \\[1ex]
& & f_j(\{i : x_{ij} = 1\}) = 1, \\[1ex]
& & x_{ij} \in \{0, 1\}, \ \forall i, \text{ and } y_j \in \{0, 1\},
\end{array}
$$

where $\bar{c}_{ij} = c_{ij} + \lambda_i$ for each $i \in N$.

In the optimal solution to Problem $P_j$, $y_j$ is either 0 or 1. If $y_j = 0$, then $x_{ij} = 0$
for all $i \in N$, and the objective function value is 0. If $y_j = 1$, then the problem
reduces to a different, but simpler, routing problem. Consider a vehicle of capacity
$Q$ initially located at site $j$. The driver gets a profit of $p_{ij} = -\bar{c}_{ij}$ for picking up
the $w_i$ items at customer $i$ in the time window $(e_i, l_i)$. The pickup operation takes
$s_i$ units of time. The objective is to choose a subset of the customers, to pick up
their loads in their time windows, without violating the capacity constraint, using
a vehicle that must begin and end at site $j$, while maximizing the driver's profit.
Let $G_j^*$ be the maximum profit attainable at site $j$; that is, $G_j^*$ is the optimal
solution to the problem just described for site $j$. This implies that $v_j - G_j^*$ is the
optimal solution value of Problem $P_j$ given that $y_j = 1$. Therefore, we can write
the optimal solution value of Problem $P_j$ as simply $\min\{0, v_j - G_j^*\}$.
Unfortunately, in general, determining the values $G_j^*$ for $j \in N$ is NP-hard.
We can, however, determine upper bounds on $G_j^*$; call them $\bar{G}_j$. This provides a
lower bound on the optimal solution to Problem $P_j$, which is equal to
$\min\{0, v_j - \bar{G}_j\}$. We use the simple bound given by $\bar{G}_j = \sum_{\{\ldots\}} p_{ij}$. Consequently,
$\sum_{j=1}^{n} \min\{0, v_j - \bar{G}_j\} - \sum_{i=1}^{n} \lambda_i$ is a lower bound on the optimal solution
to the CVLPTW.
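A minimal sketch of this lower-bound computation follows. The index set of the book's bound $\bar{G}_j$ is not legible in this copy, so the sketch substitutes an assumed bound: the sum of all positive profits $p_{ij} = -(c_{ij} + \lambda_i)$ at site $j$. That sum ignores capacity and time windows, so it is a valid (possibly loose) upper bound on $G_j^*$, but it may differ from the bound used in the text. The function name is hypothetical.

```python
def lagrangian_lower_bound(c, v, lam):
    """Lower bound sum_j min{0, v_j - G_bar_j} - sum_i lam_i on the CVLPTW.

    c[i][j]: assignment costs, v[j]: setup costs, lam[i]: multipliers for (18.2).
    G_bar_j is ASSUMED here to be the sum of positive profits at site j.
    """
    n, m = len(lam), len(v)
    total = 0.0
    for j in range(m):
        # profit p_ij = -(c_ij + lam_i); only positive profits can help the driver
        g_bar = sum(max(0.0, -(c[i][j] + lam[i])) for i in range(n))
        total += min(0.0, v[j] - g_bar)
    return total - sum(lam)

c = [[1.0, 2.0], [2.0, 1.0]]
v = [3.0, 3.0]
bound = lagrangian_lower_bound(c, v, [-4.0, -4.0])
```

In this two-site instance, opening one site costs 3 + 1 + 2 = 6 and opening both costs 8, so the optimum without capacity or time-window restrictions is 6; the multipliers above give a bound of 4, and zero multipliers give the trivial bound 0.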

To generate a feasible solution to the VRPTW at each iteration of the procedure,
we use information from the upper bounds on profit $\bar{G}_j$ for $j \in N$. After every
iteration of the lower bound (for each multiplier), we renumber the sites so that
$\bar{G}_1 \ge \bar{G}_2 \ge \cdots \ge \bar{G}_n$. The upper bounds on profit are used as an estimate of
the profitability of placing a vehicle at a particular site. For example, site 1 is 
considered to be a “good” site (or seed customer), since a large profit is possible 
there. A large profit for site j corresponds to a seed customer where neighboring 
customers can be feasibly served from it at low cost. Therefore, a site with large 
profit is selected as a seed customer since it will tend to have neighboring customers 
around it that can be feasibly served by a vehicle starting at that site. 

To generate a feasible solution to the CVLPTW, we do the following: Starting 
with j = 1 in the new ordering of the sites (customers), we locate a vehicle at 
site j. For every customer still not assigned to a site, we first determine if this 
customer can be feasibly served with the customers that are currently assigned 
to site j. Then, of the customers that can be served from this site, we determine 
the one that will cause the least increase in cost, that is, the one with minimum 
$c_{ij}$ over all customers $i$ that can be served from this site. We then assign this
customer to the site. We continue until no more customers can be assigned to site 
j, due to capacity or time constraints. We then increment j to 2 and continue 
with site 2. After all customers have been feasibly assigned to a site, we obtain a 
feasible solution whose cost is compared to the cost of the current best solution. 

As we find solutions to the CVLPTW, we also generate feasible solutions to the 
VRPTW, using the information from the lower bound to the CVLPTW. Starting 
with j = 1, pick customer j as a seed customer. Then, for every customer that 
can be feasibly served with this seed, we determine the added distance this would 
entail; that is, we determine the best place to insert the customer into the current 
tour through the customers assigned to seed j. We choose the customer that causes 
the least increase in distance traveled as the one to assign to seed j. This idea is 
similar to the nearest-insertion heuristic discussed in Sect. 4.3.2. We then continue 
trying to add customers in this way to seed j. Once no more can be added to this 
tour (due to capacity or time constraints), we increment j to 2, select seed customer 
2, and continue. Once every customer appears in a tour, that is, every customer 
is assigned to a seed, we have a feasible solution to the VRPTW corresponding to 
the current set of multipliers. The cost of this solution is compared to the cost of 
the current best solution. 

Multipliers are updated using (6.6). The step size is initially set to 2 and halved 
after the lower bound has not improved in a series of 30 iterations. After the step 
size has reached a preset minimum (0.05), the heuristic is terminated. 
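The step-size rule can be sketched as follows. This is one reading of the rule, under the assumptions that "a series of 30 iterations" means 30 consecutive iterations without improvement of the best lower bound and that the procedure stops once the step size is at or below the minimum. The function name and the way lower bounds are supplied are hypothetical.

```python
import math

def step_size_schedule(lower_bounds, initial=2.0, patience=30, minimum=0.05):
    """Return the step size used at each iteration until termination."""
    step, best, stalled = initial, -math.inf, 0
    used = []
    for lb in lower_bounds:
        used.append(step)              # step size applied at this iteration
        if lb > best:                  # lower bound improved: reset the counter
            best, stalled = lb, 0
        else:
            stalled += 1
            if stalled == patience:    # 30 iterations in a row without improvement
                step /= 2.0
                stalled = 0
                if step <= minimum:    # preset minimum reached: terminate
                    break
    return used

steps = step_size_schedule([0.0] * 1000)
```

With a lower bound that never improves after the first iteration, the step size takes the values 2, 1, 0.5, 0.25, 0.125, and 0.0625 before the procedure stops.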
18.4.3 Implementation

It is clear that many possible variations of the LBH can be implemented depending
on the type of assignment costs ($c_{ij}$) used. In the computational results discussed
below, the following have been implemented:

direct cost: $c_{ij} = 2d_{ij}$, and

nearest-insertion cost: $c_{ij} = d_i + d_{ij} - d_j$.

Direct cost $c_{ij}$ has the advantage that, when several customers are added to the
seed, the resulting cost, which is the sum of the setup costs and these direct costs,
is an upper bound on the length of any efficient route through the customers. On
the other hand, the nearest-insertion cost works well because it is accurate at least
for tours through two customers, and often for tours through three customers as
well.

Several versions of the LBH have been implemented and tested. In the first, the
star-tours (ST) heuristic, the direct assignment cost is used, while in the second,
the seed-insertion (SI) heuristic, the nearest-insertion assignment cost is applied.
Observe that the LBH is not a polynomial-time heuristic. However, as we shall
shortly demonstrate, the running times reported on standard test problems are
very reasonable and are comparable to the running times of many heuristics for
the vehicle routing problem.

The ST heuristic is of particular interest because it is asymptotically optimal as
demonstrated in the following lemma. The proof is similar to the previous proofs
and is therefore omitted.

Lemma 18.4.1 Let $n$ customers, indexed by $N$, be independently and identically
distributed according to a distribution $\mu$ with compact support in $\mathbb{R}^2$. Define

$$E(d) = \int_{\mathbb{R}^2} \|x\| \, d\mu(x).$$

Let the customer parameters $\{(w_k, e_k, s_k, l_k) : k \in N\}$ be jointly distributed like
$\Phi$. In addition, let $M^*$ be the minimum number of machines needed to feasibly
schedule the jobs $\{(w_k, e_k, s_k, l_k) : k \in N\}$, and let $\lim_{n\to\infty} M^*/n = \gamma$ (a.s.).
Then

$$\lim_{n\to\infty} \frac{1}{n} Z^{ST} = \lim_{n\to\infty} \frac{1}{n} Z_t^* = 2\gamma E(d) \quad \text{(a.s.)}.$$

18.4.4 Numerical Study

Tables 18.1 and 18.2 summarize the computational experiments with the standard
test problems of Solomon (1986). The problem set consists of 56 problems
of various types. All problems consist of 100 customers and one depot, and the
distances are Euclidean. Problems with the "R" prefix are problems where the
customer locations are randomly generated according to a uniform distribution.
TABLE 18.1. Computational results: Part I

Problem   Alg. ST    CPU time (s)   Alg. SI    CPU time (s)   Solomon's best solution
C201        591.6       245.9         591.6       260.5          591
C202       *652.8       276.1        *640.8       262.7          731
C203       *692.2       309.2        *741.1       308.9          786
C204       *721.6       335.9         782.3       340.6          758
C205        713.8       250.8         699.9       258.8          606
C206        770.8       257.3        *722.8       283.3          730
C207        767.2       265.7         708.9       275.8          680
C208        736.2       287.7         660.2       272.4          607
R201      *1665.3       207.1       *1533.4       209.6        1,741
R202      *1485.3       276.4       *1484.3       248.5        1,730
R203      *1371.5       406.5       *1349.3       389.0        1,567
R204       1096.7       532.0        1077.0       538.2        1,059
R205       1472.3       287.0       *1329.4       312.6        1,471
R206      *1237.0       412.2       *1283.7       374.2        1,405
R207      *1217.7       484.8       *1162.9       453.9        1,241
R208       *966.1       587.8        *959.9       612.6        1,046
R209      *1276.1       394.8       *1262.8       355.7        1,418
R210      *1312.5       380.7       *1340.6       388.6        1,425
R211       1080.9       474.7        1141.3       488.7        1,016
RC201     *1873.8       203.5       *1841.7       185.8        1,880
RC202     *1742.1       227.8       *1705.1       241.0        1,799
RC203     *1417.5       331.5       *1471.1       300.1        1,550
RC204     *1139.6       437.7       *1190.3       411.5        1,208
RC205     *1830.5       233.0       *1878.9       214.0        2,080
RC206      1640.1       259.0        1607.5       248.2        1,582
RC207     *1566.4       294.2       *1557.3       272.3        1,632
RC208      1254.8       345.7        1298.7       317.3        1,194

* indicates that the LBH improves upon the best solution known



Problems with the “C” prefix are problems where the customer locations are clus- 
tered. Problems with the “RC” prefix are a mixture of both random and clustered. 
In addition, all of the problems have a constraint on the latest time $T_0$ at which a
vehicle can return to the depot. For a full description of these problems, we refer 
the reader to Solomon. 

We compare the performance of the LBH against the heuristics of Solomon and 
the column-generation approach of Desrochers et al. (1992). The latter method 
was able to solve effectively 7 of the 56 test problems; we describe this approach 
in the next chapter. 

To compare the LBH to these solution methods, we implemented a time-window
reduction phase before the start of the heuristic. Here, the earliest time for service
$e_k$ is replaced by $\max\{e_k, d_k\}$; in that way, vehicles leave the depot no earlier than
time 0. In addition, the latest time service can end, $l_k$, is replaced by $\min\{l_k, T_0 - d_k\}$.
The LBH can then be run as it is described in Sect. 18.4.1.

TABLE 18.2. Computational results: Part II

Problem   Alg. ST    CPU time (s)   Alg. SI    CPU time (s)   Solomon's best solution   DDS solution value
C101        828.9        74.1         828.9        67.0          829                       827.3
C102        982.8        82.9        1043.4        73.1          968                       827.3
C103      *1015.1        95.9        1232.9        88.4        1,026
C104       *980.9       105.4        *976.1       114.5        1,053
C105       *828.9        79.7         860.8        67.3          829
C106        852.9        82.8         880.1        66.7          834                       827.3
C107        828.9        83.1         841.2        74.7          829                       827.3
C108        852.9        88.6         853.6        80.9          829                       827.3
C109        991.0        88.6        1014.5        83.1          829
R101       1983.7        57.2        2071.2        39.9        1,873                      1607.7
R102       1789.0        70.8        1821.4        57.4        1,843                      1434.0
R103       1594.5        88.6        1599.1        67.9        1,484
R104       1242.0       106.2        1237.3        81.0        1,188
R105       1604.4        67.0        1696.2        52.0        1,502
R106       1606.9        78.0        1589.2        70.0        1,460
R107      *1324.9        92.4        1361.2        70.4        1,353
R108       1202.6       107.5        1205.5       101.1        1,134
R109       1504.7        78.5        1491.8        69.6        1,412
R110       1380.9        92.0        1434.4        69.4        1,211
R111       1422.1        91.7        1432.4        69.5        1,202
R112       1248.1       105.2        1284.6        79.4        1,086
RC101      2045.1        60.6        2014.4        45.0        1,867
RC102      1806.6        68.7        1969.5        52.2        1,760
RC103      1708.9        81.7        1716.3        69.6        1,641
RC104      1372.1        93.5        1458.8        79.5        1,301
RC105     *1826.3        68.9        2036.8        51.3        1,922
RC106      1710.8        68.0        1804.8        50.5        1,611
RC107      1593.2        76.4        1630.9        64.9        1,385
RC108      1421.0        84.7        1493.8        65.5        1,253

* indicates that the LBH improves upon the best solution known
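The time-window reduction phase amounts to two elementwise operations, sketched here. The function name and list-based inputs are assumptions made for the example; `d[k]` is the depot-to-customer distance $d_k$, with travel time taken equal to distance as the replacement rule implies.

```python
def reduce_time_windows(e, l, d, T0):
    """Tighten each window to [max(e_k, d_k), min(l_k, T0 - d_k)]."""
    e_new = [max(ek, dk) for ek, dk in zip(e, d)]       # cannot arrive before time d_k
    l_new = [min(lk, T0 - dk) for lk, dk in zip(l, d)]  # must still return by T0
    return e_new, l_new

e_new, l_new = reduce_time_windows(e=[0, 10], l=[100, 50], d=[5, 8], T0=90)
# e_new == [5, 10], l_new == [85, 50]
```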

As can be seen in the tables, both the ST and SI heuristics have been imple- 
mented. CPU times are in seconds on a Sun SPARC Station II. In Tables 18.1 
and 18.2, the column “Solomon’s best solution” corresponds to the best solution 
found by Solomon. Solomon tested eight different heuristics on problem sets R1 
and C1, and six heuristics on problems RC1, R2, C2, and RC2. We see that the
ST heuristic provides a better solution than Solomon’s heuristics in 25 of the 56 
problems, while the SI heuristic provides a better solution in 21 of the 56 problems. 
In Table 18.2, the column “DDS solution value” corresponds to the value of the 
solution found using the column-generation approach of Desrochers et al. 
18.5 Exercises

Exercise 18.1. You are given a network $G = (V, A)$, where $|V| = n$, $d(i, j)$ is
the length of edge $(i, j)$, and $a \in V$ is a specified vertex. One service unit is located
at $a$ and has to visit each vertex in $V$ so that the total waiting time of all vertices
is as small as possible. Assume the waiting time of a vertex is proportional to the
total distance traveled by the server from $a$ to the vertex. The total waiting time
(summed up over all customers) is then

$$(n - 1)\,d(a, 2) + (n - 2)\,d(2, 3) + (n - 3)\,d(3, 4) + \cdots + d(n - 1, n).$$

The delivery man problem (DMP) is the problem of determining the tour that 
minimizes the total waiting time. 
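For concreteness, the total waiting time of a given visiting order can be computed by accumulating arrival times, as in the sketch below. The function name and the representation of the order as a list starting at $a$ are assumptions made for the example.

```python
def total_waiting_time(order, dist):
    """Sum over visited vertices of the distance traveled from order[0] to reach them."""
    total = 0
    elapsed = 0
    for u, w in zip(order, order[1:]):
        elapsed += dist(u, w)   # arrival time at w
        total += elapsed        # w has waited `elapsed` units
    return total

line = lambda u, w: abs(u - w)          # vertices placed on a line, for illustration
t1 = total_waiting_time([0, 1, 2, 3], line)
t2 = total_waiting_time([0, 3, 1, 2], line)
```

Here `t1` is 3(1) + 2(1) + 1(1) = 6 and `t2` is 3(3) + 2(2) + 1(1) = 14, in agreement with the weighted-sum expression above.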

Assume that $G$ is a tree with $d(i, j) = 1$ for every $(i, j) \in A$. Show that any tour
that follows a depth-first search starting from $a$ is optimal.

Exercise 18.2. Consider the delivery man problem described in Exercise 18.1. 
A delivery man currently located at the depot must visit each of n customers. Let 
$Z^{DM}$ be the total waiting time in the optimal delivery man tour through the $n$
points. Let $Z^*$ be the total time required to travel the optimal traveling salesman
tour through the n points. 

(a) Prove that 



(b) One heuristic proposed for this problem is the nearest-neighbor (NN) heuris- 
tic. In this heuristic, the vehicle serves the closest unvisited customer next. 
Provide a family of examples to show that the heuristic does not have a fixed 
worst-case bound. 

Exercise 18.3. Consider the vehicle routing problem with distance constraints. 
Formally, a set of customers has to be served by vehicles that are all located at 
a common depot. The customers and the depot are presented as the nodes of 
an undirected graph $G = (N, E)$. Each customer has to be visited by a vehicle.
The $j$th vehicle starts from the depot and returns to the depot after visiting a
subset $N_j \subseteq N$. The total distance traveled by the $j$th vehicle is denoted by
$T_j$. Each vehicle has a distance constraint $A$: No vehicle can travel more than
$A$ units of distance (i.e., $T_j \le A$). We assume that the distance matrix satisfies
the triangle inequality assumption. Also, assume that the length of the optimal
traveling salesman tour through all the customers and the depot is greater than $A$.

(a) Suppose the objective function is to minimize the total distance traveled. Let
$K^*$ be the number of vehicles in an optimal solution to this problem. Show
that there always exists an optimal solution with total distance traveled
$> \frac{1}{2} K^* A$. Does this lower bound hold for any optimal solution?
(b) Consider the following greedy heuristic: Start with the optimal traveling
salesman tour through all the customers and the depot. In an arbitrary
orientation of this tour, the nodes are numbered $(i_0, i_1, \ldots, i_n) = S$ in order
of appearance, where $n$ is the number of customers, $i_0$ is the depot, and
$i_1, i_2, \ldots, i_n$ are the customers. We break the tour into $K^H$ segments and
connect the endpoints of each segment to the depot. This is done in the
following way. Each vehicle $j$, $1 \le j \le K^H$, starts by traveling from the
depot to the first customer $i_q$ not visited by the previous $j - 1$ vehicles
and then visits the maximum number of customers according to $S$ without
violating the distance constraint upon returning to the depot.

Show that K H < min{n, [ 1 } ? where T is the length of the optimal travel- 

ing salesman tour and d m is the distance from the depot to the farthest customer. 
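The greedy tour-splitting heuristic of part (b) can be sketched as follows. The function name, the representation of the tour as a list with the depot first, and the toy instance on a line are assumptions made for illustration; the sketch takes the tour order as given rather than computing an optimal traveling salesman tour.

```python
def split_tour(tour, dist, limit):
    """Split tour = [depot, i_1, ..., i_n] into routes of length at most `limit`."""
    depot, customers = tour[0], tour[1:]
    routes, q = [], 0
    while q < len(customers):
        first = customers[q]
        length = dist(depot, first)                  # depot -> first unvisited customer
        if length + dist(first, depot) > limit:
            raise ValueError("customer cannot be served within the distance limit")
        route = [first]
        q += 1
        # extend along the tour while the vehicle can still return to the depot
        while q < len(customers):
            nxt = customers[q]
            if length + dist(route[-1], nxt) + dist(nxt, depot) > limit:
                break
            length += dist(route[-1], nxt)
            route.append(nxt)
            q += 1
        routes.append(route)
    return routes

line = lambda u, w: abs(u - w)                       # nodes are points on a line
routes = split_tour([0, 1, 2, 3, -1, -2], line, limit=6)
```

In this instance the first vehicle serves the customers at 1, 2, and 3 (route length 6) and the second serves those at -1 and -2 (route length 4).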

Exercise 18.4. Consider the pickup and delivery problem. Here customers are
pickup customers with probability $p$ and delivery customers with probability $1 - p$.
Assume a vehicle capacity of 1. If customer $i$ is a pickup customer, then a load
of size $w_i \le 1$ must be picked up at the customer and brought to the depot.
If customer $i$ is a delivery customer, then a load of size $w_i \le 1$ must be brought
from the depot to the customer. Assume pickup sizes are drawn randomly from a
distribution with bin-packing constant $\gamma_P$ and delivery sizes are drawn randomly
from a distribution with bin-packing constant $\gamma_D$. A pickup and a delivery can be
in the vehicle at the same time.

(a) Develop a heuristic $H$ for this problem and determine $\lim_{n\to\infty} Z^H/n$ as a
function of $p$, $\gamma_P$, and $\gamma_D$.

(b) Assume all pickups are of size | and deliveries are of size |. Suggest a 
better heuristic for this case. What is lim n ^. 00 as a function of p for this 
heuristic? 



19 

Solving the VRP Using a 
Column-Generation Approach 



19.1 Introduction 

A classical method, first suggested by Balinski and Quandt (1964), for solving
the VRP with capacity and time-window constraints, is based on formulating the
problem as a set-partitioning problem. (See Chap. 6 for a general discussion of set
partitioning.) The idea is as follows: Let the index set of all feasible routes be
$\{1, 2, \ldots, R\}$ and let $c_r$ be the length of route $r$. Define

$$
\alpha_{ir} = \begin{cases} 1, & \text{if customer } i \text{ is served in route } r, \\ 0, & \text{otherwise,} \end{cases}
$$

for each customer $i = 1, 2, \ldots, n$ and each route $r = 1, 2, \ldots, R$. Also, for every
$r = 1, 2, \ldots, R$, let

$$
y_r = \begin{cases} 1, & \text{if route } r \text{ is in the optimal solution,} \\ 0, & \text{otherwise.} \end{cases}
$$

In the set-partitioning formulation of the VRP, the objective is to select a
minimum-cost set of feasible routes such that each customer is included in some route. It is

$$
\begin{array}{llll}
\text{Problem } S: & \text{Min} & \displaystyle\sum_{r=1}^{R} c_r y_r & \\[2ex]
& \text{s.t.} & \displaystyle\sum_{r=1}^{R} \alpha_{ir} y_r \ge 1, \quad \forall i = 1, 2, \ldots, n, & (19.1) \\[2ex]
& & y_r \in \{0, 1\}, \quad \forall r = 1, 2, \ldots, R. &
\end{array}
$$

Observe that we have written constraints (19.1) as inequality constraints instead 
of equality constraints. The formulation with equality constraints is equivalent if 
we assume the distance matrix {dij} satisfies the triangle inequality; therefore, each 
customer will be visited exactly once in the optimal solution. The formulation with 
inequality constraints will prove to be easier to work with from an implementation 
point of view. 

Cullen et al. (1981) were the first to use this formulation successfully to design
heuristic methods for the VRP. Later, Desrochers et al. (1992) used it in
conjunction with a branch-and-bound method to generate optimal or near-optimal
solutions to the VRP. Similar methods have been used to solve crew scheduling
problems; see, for example, Hoffman and Padberg (1993).

Of course, the set of all feasible routes is extremely large, and one cannot expect 
to generate it completely. Even if this set is given, it is not clear how to solve the 
set-partitioning problem since it is a large-scale integer program. Any method 
based on this formulation must overcome these two obstacles. We start here, in 
Sect. 19.2, by showing how the linear relaxation of the set-partitioning problem 
can be solved to optimality without enumerating all possible routes. In Sect. 19.3, 
we combine this method with a polyhedral approach that generates an optimal or 
near-optimal solution to the VRP. Finally, in Sect. 19.4, we provide a probabilistic 
analysis that helps explain why a method of this type will be effective. 

To simplify the presentation, we assume no time-window constraints exist; the
extension to the more general model is straightforward, for the most part. The 
interested reader can find some of these extensions in Desrochers et al. (1992). 



19.2 Solving a Relaxation of the Set-Partitioning 
Formulation 

To solve the linear relaxation of Problem S without enumerating all the routes, 
Desrochers et al. (1992) use the celebrated column-generation technique. A thor- 
ough explanation of this method is given below, but the general idea is as follows. 
A portion of all possible routes is enumerated, and the resulting linear relaxation 
with this partial route set is solved. The solution to this linear program is then 
used to determine if there are any routes not included that can reduce the objective
function value. This is the column-generation step. Using the values of the optimal
dual variables (with respect to the partial route set), the procedure generates a new
route, and the linear relaxation is re-solved. This is continued until one can show
that an optimal solution to the linear program is found, one that is optimal for
the complete route set.

Specifically, this is done by enumerating a partial set of routes, $1, 2, \ldots, R'$, and
formulating the corresponding linear relaxation of the set-partitioning problem
with respect to this set:

$$\begin{aligned}
\text{Problem } S': \quad \min \ & \sum_{r=1}^{R'} c_r y_r \\
\text{s.t.} \ & \sum_{r=1}^{R'} \alpha_{ir} y_r \ge 1, \quad \forall i = 1, 2, \ldots, n, \qquad (19.2) \\
& y_r \ge 0, \quad \forall r = 1, 2, \ldots, R'.
\end{aligned}$$

Let $y$ be the optimal solution to Problem $S'$, and let $\bar{\pi}$ be the corresponding
optimal dual variables. We would like to know whether $y$ (or equivalently, $\bar{\pi}$) is
optimal for the linear relaxation of Problem $S$ (respectively, the dual of the linear
relaxation of Problem $S$). To answer this question, observe that the dual of the
linear relaxation of Problem $S$ is

$$\begin{aligned}
\text{Problem } S_D: \quad \max \ & \sum_{i=1}^{n} \pi_i \\
\text{s.t.} \ & \sum_{i=1}^{n} \alpha_{ir} \pi_i \le c_r, \quad \forall r = 1, 2, \ldots, R, \qquad (19.3) \\
& \pi_i \ge 0, \quad \forall i = 1, 2, \ldots, n.
\end{aligned}$$

Clearly, if $\bar{\pi}$ satisfies every constraint (19.3), then it is optimal for Problem $S_D$,
and therefore, $y$ is optimal for the linear programming relaxation of Problem $S$.
How can we check whether $\bar{\pi}$ satisfies every constraint in Problem $S_D$? Observe
that the vector $\bar{\pi}$ is not feasible in Problem $S_D$ if we can identify a single constraint,
$r$, such that

$$\sum_{i=1}^{n} \alpha_{ir} \bar{\pi}_i > c_r.$$

Consequently, if we can find a column $r$ minimizing the quantity $c_r - \sum_{i=1}^{n} \alpha_{ir} \bar{\pi}_i$
and this quantity is negative, then a violated constraint is found. In that case,
the current vector $\bar{\pi}$ is not optimal for Problem $S_D$. The corresponding column
just found can be added to the formulation of Problem $S'$, which is solved again.
The process repeats itself until no violated constraint (column) is found; in this
case, we have found the optimal solution to the linear relaxation of Problem $S$
(the vector $y$) and the optimal solution to Problem $S_D$ (the vector $\bar{\pi}$).
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Our task is then to find a column, or a route, r minimizing the quantity 

$$c_r - \sum_{i=1}^{n} \alpha_{ir} \bar{\pi}_i. \qquad (19.4)$$

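We can look at this problem in a different way. Suppose we replace each distance
$d_{ij}$ with a new distance $d'_{ij}$ defined by

$$d'_{ij} = d_{ij} - \frac{\bar{\pi}_i}{2} - \frac{\bar{\pi}_j}{2},$$

where $\bar{\pi}_0 = 0$ for the depot. Then a tour $u_1 \to u_2 \to \cdots \to u_\ell$ whose length
using $\{d_{ij}\}$ is $\sum_{i=1}^{\ell-1} d_{u_i u_{i+1}} + d_{u_\ell u_1}$ has, using $\{d'_{ij}\}$, a length

$$\sum_{i=1}^{\ell-1} d'_{u_i u_{i+1}} + d'_{u_\ell u_1} = \sum_{i=1}^{\ell-1} d_{u_i u_{i+1}} + d_{u_\ell u_1} - \sum_{i=1}^{\ell} \bar{\pi}_{u_i}.$$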

Hence, finding a route $r$ that minimizes (19.4) is the same as using the distance
matrix $\{d'_{ij}\}$ to find a minimum-length tour that starts and ends at the depot, visits
a subset of the customers, and has a total load no more than $Q$. Unfortunately,
this itself is an NP-hard problem, and so we are left with a method that is not
attractive computationally.
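
The equivalence can be checked numerically. The sketch below assumes the modified distance splits each dual value evenly between the two edges incident to a node, $d'_{ij} = d_{ij} - \bar{\pi}_i/2 - \bar{\pi}_j/2$ with $\bar{\pi}_0 = 0$ at the depot; the coordinates and dual values are made up, and the helper names are hypothetical.

```python
from math import dist, isclose

# Hypothetical data: node 0 is the depot; the dual values are invented.
POINTS = {0: (0.0, 0.0), 1: (0.0, 3.0), 2: (4.0, 3.0), 3: (4.0, 0.0)}
PI = {0: 0.0, 1: 5.0, 2: 7.5, 3: 6.0}  # pi_0 = 0 for the depot


def d(i, j):
    """Original distance d_ij."""
    return dist(POINTS[i], POINTS[j])


def d_mod(i, j):
    """Assumed modified distance: each endpoint absorbs half of its dual."""
    return d(i, j) - PI[i] / 2 - PI[j] / 2


def tour_length(tour, metric):
    """Length of the closed tour tour[0] -> tour[1] -> ... -> tour[0]."""
    return sum(metric(a, b) for a, b in zip(tour, tour[1:] + tour[:1]))


def reduced_cost(tour):
    """Route cost minus the duals of the nodes on the route, as in (19.4)."""
    return tour_length(tour, d) - sum(PI[i] for i in tour)


tour = [0, 1, 2, 3]
# Every node lies on exactly two edges of the tour, so the two quantities agree.
assert isclose(tour_length(tour, d_mod), reduced_cost(tour))
```

A negative value for some tour means the corresponding column violates its dual constraint and is a candidate to enter the restricted problem.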

To overcome this difficulty, the set-partitioning formulation, Problem $S$, is modified
to allow routes to visit the same customer more than once. The purpose of
this modification will be clear in a moment. This model, call it Problem $S_M$ (where
$M$ stands for the “modified” formulation), is defined as follows: Enumerate all feasible
routes, satisfying the capacity constraint, that may visit the same customer a
number of times; each such visit increases the total load by the demand of that customer.
Let the number of routes (columns) be $R_M$, and let $c_r$ be the total distance
traveled in route $r$. For each customer $i = 1, 2, \ldots, n$ and route $r = 1, 2, \ldots, R_M$,
let

$$\xi_{ir} = \text{number of times customer } i \text{ is visited in route } r.$$

Also, for each $r = 1, 2, \ldots, R_M$, define

$$y_r = \begin{cases} 1, & \text{if route } r \text{ is in the optimal solution}, \\ 0, & \text{otherwise.} \end{cases}$$

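The VRP can be formulated as

$$\begin{aligned}
\text{Problem } S_M: \quad \min \ & \sum_{r=1}^{R_M} c_r y_r \\
\text{s.t.} \ & \sum_{r=1}^{R_M} \xi_{ir} y_r \ge 1, \quad \forall i = 1, 2, \ldots, n, \qquad (19.5) \\
& y_r \in \{0, 1\}, \quad \forall r = 1, 2, \ldots, R_M.
\end{aligned}$$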
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This is the set-partitioning problem solved by Desrochers et al. (1992), and 
therefore, it is not exactly Problem S. Clearly, the optimal integer solution to 
Problem $S_M$ is the optimal solution to the VRP. However, the optimal solution
values of the linear relaxations of Problem $S_M$ and Problem $S$ may be different.
Of course, since every column of Problem $S$ is also a column of Problem $S_M$, the
linear relaxation of Problem $S_M$ provides a lower bound on the
linear relaxation of Problem $S$.

To solve the linear relaxation of Problem $S_M$, we use the method described above
(for solving Problem $S$): We enumerate a partial set of $R'_M$ routes; solve Problem
$S'_M$, which is the linear relaxation of Problem $S_M$ defined only on this partial
list; use the dual variables to see whether a column not in the current partial list
with $\sum_{i=1}^{n} \xi_{ir} \bar{\pi}_i > c_r$ exists. If such a column(s) exists, we add it (them) to the
formulation and solve the resulting linear program again. Otherwise, we have the
optimal solution to the linear relaxation of Problem $S_M$.

The modification we have made makes the column-generation step computationally
easier: a column of minimum reduced cost can now be found in pseudopolynomial
time using dynamic programming.

For this purpose, we need the following definitions. Given a path $P = \{0, u_1, u_2,
\ldots, u_\ell\}$, where it is possible that $u_i = u_j$ for $i \ne j$, let the load of this path be
$\sum_{i=1}^{\ell} w_{u_i}$. That is, the load of the path is the sum, over all customers in $P$, of the
demand of a customer multiplied by the number of times that customer appears
in $P$. Let $f_q(i)$ be the cost [using $\{d'_{ij}\}$] of the least-cost path that starts at the
depot and terminates at vertex $i$ with total load $q$. This can be calculated using
the recursion

$$f_q(i) = \min_{j \ne i} \left\{ f_{q - w_i}(j) + d'_{ij} \right\}, \qquad (19.6)$$

with the initial conditions

$$f_q(i) = \begin{cases} d'_{0i}, & \text{if } q = w_i, \\ +\infty, & \text{otherwise.} \end{cases}$$

Finally, let $\bar{f}_q(i) = f_q(i) + d'_{0i}$. Thus, $\bar{f}_q(i)$ is the length of a least-cost tour that
starts at the depot, visits a subset of the customers, of which customer $i$ is the last
to be visited, has a total load $q$, and terminates at the depot. Observe that finding
$\bar{f}_q(i)$ for every $q$, $1 \le q \le Q$, and every $i$, $i \in N$, requires $O(n^2 Q)$ calculations. The
recursion chooses the predecessor of $i$ to be a node $j \ne i$. This requires repeat visits
to the same customer to be separated by at least one visit to another customer.
In fact, expanding the state space of this recursion can eliminate two-loops: loops of
the type $\ldots i, j, i \ldots$. This forces repeat visits to the same customer to be separated
by visits to at least two other customers. This can lead to a stronger relaxation
of the set-partitioning model. For a more detailed discussion of this recursion, see
Christofides et al. (1981).

If there exist a $q$, $1 \le q \le Q$, and an $i \in N$ with $\bar{f}_q(i) < 0$, then the current
vectors $y$ and $\bar{\pi}$ are not optimal for the linear relaxation of Problem $S_M$. In such
a case, we add the column corresponding to this tour [the one with negative $\bar{f}_q(i)$]
to the set of columns in Problem $S'_M$. If, on the other hand, $\bar{f}_q(i) \ge 0$ for every $q$
and $i$, then the current $y$ and $\bar{\pi}$ are optimal for the linear relaxation of Problem $S_M$.
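
The recursion can be sketched in a few lines. The code below is an illustration under stated assumptions, not the implementation of Desrochers et al.: demands are positive integers, the modified distance matrix is symmetric with node 0 as the depot, the initial condition is used for $q = w_i$ and the recursion for $q > w_i$, and the function name `price_route` and the sample matrix are hypothetical.

```python
from math import inf, isclose


def price_route(d_mod, w, Q):
    """Minimise fbar_q(i) = f_q(i) + d'_{i0} over customers i and loads q.

    d_mod: symmetric (n+1) x (n+1) matrix of modified distances, node 0 = depot.
    w:     positive integer demands; w[0] is ignored.
    Returns (value, route), where route lists customers in visiting order.
    """
    n = len(w) - 1
    f = [[inf] * (n + 1) for _ in range(Q + 1)]      # f[q][i] as in (19.6)
    pred = [[None] * (n + 1) for _ in range(Q + 1)]  # predecessor of i on that path
    for i in range(1, n + 1):                        # initial conditions
        if w[i] <= Q:
            f[w[i]][i] = d_mod[0][i]
    for q in range(1, Q + 1):
        for i in range(1, n + 1):
            rest = q - w[i]
            if rest < 1:
                continue                             # only the initial condition applies
            for j in range(1, n + 1):
                if j != i and f[rest][j] + d_mod[j][i] < f[q][i]:
                    f[q][i] = f[rest][j] + d_mod[j][i]
                    pred[q][i] = j
    best, best_q, best_i = inf, None, None
    for q in range(1, Q + 1):
        for i in range(1, n + 1):
            if f[q][i] + d_mod[i][0] < best:         # close the tour at the depot
                best, best_q, best_i = f[q][i] + d_mod[i][0], q, i
    route, q, i = [], best_q, best_i
    while i is not None:                             # walk predecessors back to the depot
        route.append(i)
        q, i = q - w[i], pred[q][i]
    return best, route[::-1]


# Hypothetical 3-customer instance; entries play the role of d'_ij.
D_MOD = [
    [0.0, 0.5, 1.25, 1.0],
    [0.5, 0.0, -2.25, -0.5],
    [1.25, -2.25, 0.0, -3.75],
    [1.0, -0.5, -3.75, 0.0],
]
value, route = price_route(D_MOD, [0, 2, 2, 2], Q=4)
```

The three nested loops over $q$, $i$, and $j$ account for the $O(n^2 Q)$ running time; a negative `value` signals a column to add.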
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To summarize, the column-generation algorithm can be described as follows. 
The Column-Generation Procedure

Step 1: Generate an initial set of $R'_M$ columns.

Step 2: Solve Problem $S'_M$ and find $y$ and $\bar{\pi}$.

Step 3: Construct the distance matrix $\{d'_{ij}\}$ and find $\bar{f}_q(i)$ for all $i \in N$ and
$1 \le q \le Q$.

Step 4: For every $i$ and $q$ with $\bar{f}_q(i) < 0$, add the corresponding column to $R'_M$
and go to step 2.

Step 5: If $\bar{f}_q(i) \ge 0$ for all $i$ and $q$, stop.

The procedure produces a vector $y$ that is the optimal solution to the linear
relaxation of Problem $S_M$. Its value is a lower bound on the optimal solution value
of the VRP.



19.3 Solving the Set-Partitioning Problem 

In the previous section, we introduced an effective method for solving the linear
relaxation of the set-partitioning formulation of the VRP, Problem $S_M$. How can
we use this solution to the linear program to find an optimal or near-optimal
integer solution?

Starting with the set of columns present at the end of the column-generation
step (the set $E$), one approach to generating an integer solution to the
set-partitioning formulation is to use the branch-and-bound method. This method
consists of splitting the problem into easier subproblems by fixing the value of
a branching variable. The variable (in this case, a suitable choice is $y_r$ for some
route $r$) is set to either 1 or 0. Each of these subproblems is solved using the same
method; that is, another variable is branched on. At each step, tests are performed to
see if the entire branch can be eliminated; that is, no better solution than the one
currently known can be found in this branch. The solution found by this method
will be the best integer solution among all the solutions that use only columns of $E$.
This solution will not necessarily be the optimal solution to the VRP, but it may be close.

Another approach that will generate the same integer solution as the
branch-and-bound method is the following. Given a fractional solution to $S_M$, we can
generate a set of constraints that will cut off this fractional solution. Then we
can re-solve this linear program; if it is integer, we have found the optimal integer
solution (among the columns of $E$). If it is still fractional, then we can continue
generating constraints and re-solving the linear program until an integer solution
is found. Again, the best integer solution found using this method may be close to
optimal. This is the method successfully used by Hoffman and Padberg (1993) to
solve crew scheduling problems.



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Formally, the method is as follows. 

The Cutting-Plane Algorithm

Step 1: Generate an initial set of $R'_M$ columns.

Step 2: Use column generation to solve Problem $S'_M$.

Step 3: If the optimal solution to Problem $S'_M$ is integer, stop.

Else, generate cutting planes separating this solution.

Add these cutting planes to the linear program $S'_M$.

Step 4: Solve the linear program $S'_M$. Go to step 3.

To illustrate this constraint-generation step (step 3), we make use of a number of
observations. First, let $E$ be the set of routes at the end of the column-generation
procedure. Clearly, we can split $E$ into two subsets. One subset $E_m$ includes every
column $r$ for which there is at least one $i$ with $\xi_{ir} \ge 2$; these columns are called
multiple-visit columns. The second subset $E_s$ includes the remaining columns; these
columns are referred to as single-visit columns. It is evident that an optimal solution
to the VRP will use no columns from $E_m$. That is, there always exists a
single-visit column of at most the same cost that can be used instead. We therefore
can immediately add the following constraint to the linear relaxation of Problem $S_M$:

$$\sum_{r \in E_m} y_r = 0. \qquad (19.7)$$

To generate more constraints, construct the intersection graph $G$. The graph $G$
has a node for each column in $E_s$. Two nodes in $G$ are connected by an edge if
the corresponding columns have at least one customer in common. Observe that
a solution to the VRP where no customer is visited more than once can be represented
by an independent set in this graph. That is, it is a collection of nodes on
the graph $G$ such that no two nodes are connected by an edge.

These observations give rise to two inequalities that can be added to the
formulation:

1. We select a subset of the nodes of $G$, say $K$, such that every pair of nodes
$i, j \in K$ is connected by an edge of $G$. Each set $K$, called a clique, must
satisfy the following condition:

$$\sum_{r \in K} y_r \le 1. \qquad (19.8)$$

Clearly, if there is a node $j \notin K$ such that $j$ is adjacent to every $i \in K$, then
we can replace $K$ with $K \cup \{j\}$ in inequality (19.8) to strengthen it (this is
called lifting). Thus, we would like to use inequality (19.8) when the
set of nodes $K$ is maximal.

2. Define a cycle $C = \{u_1, u_2, \ldots, u_\ell\}$ in $G$ such that node $u_i$ is adjacent to
$u_{i+1}$, for each $i = 1, 2, \ldots, \ell - 1$, and node $u_\ell$ is adjacent to node $u_1$. A cycle
$C$ is called an odd cycle if the number of nodes in $C$, $|C| = \ell$, is odd. An odd
cycle is called an odd hole if there is no arc connecting two nodes of the cycle
except the $\ell$ arcs defining the cycle. It is easy to see that in any optimal
solution to the VRP, each odd hole must satisfy the following property:

$$\sum_{r \in C} y_r \le \frac{|C| - 1}{2}. \qquad (19.9)$$


19.3.1 Identifying Violated Clique Constraints 

Hoffman and Padberg suggest several procedures for clique identification, one of
which is based on the fact that small problems can be solved quickly by enumeration.
For this purpose, select $v$ to be the node with minimum degree among all
nodes of $G$. Clearly, every clique of $G$ containing $v$ consists of $v$ together with a
subset of the neighbors of $v$, denoted by $\mathrm{neigh}(v)$. Thus, starting with $v$ as a
temporary clique, that is, $K = \{v\}$, we add an arbitrary node $w$ from $\mathrm{neigh}(v)$ to $K$.
We now delete from $\mathrm{neigh}(v)$ all nodes that are not connected to every node of $K$,
in this case both $v$ and $w$. Continue adding nodes in this manner from the current
set $\mathrm{neigh}(v)$ to $K$ until either there is no node in $\mathrm{neigh}(v)$ connected to all nodes
in $K$, or $\mathrm{neigh}(v) = \emptyset$.
In the end, $K$ will be a maximal clique. We can then calculate the weight of this
clique, that is, the sum of the values (in the linear program) of the columns in
the clique. If the weight is more than 1, then the corresponding clique inequality
is violated. If not, then we continue the procedure with a new starting node. The
method can be improved computationally by, for example, always choosing the
“heaviest” among those nodes eligible to enter the clique.
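
A minimal sketch of this procedure follows. It is an illustration rather than Hoffman and Padberg's code: columns are given as sets of customers, the intersection graph is built directly from them, candidates are added heaviest first, and all names (`intersection_graph`, `greedy_clique`, `violated_clique`) and the sample data are hypothetical.

```python
def intersection_graph(columns):
    """columns: dict name -> set of customers. Returns an adjacency dict."""
    adj = {r: set() for r in columns}
    names = list(columns)
    for pos, a in enumerate(names):
        for b in names[pos + 1:]:
            if columns[a] & columns[b]:   # the two routes share a customer
                adj[a].add(b)
                adj[b].add(a)
    return adj


def greedy_clique(adj, y, start):
    """Grow a maximal clique from `start`, adding the heaviest candidate first."""
    clique = {start}
    candidates = set(adj[start])          # nodes adjacent to every clique member
    while candidates:
        w = max(candidates, key=lambda r: (y[r], r))
        clique.add(w)
        candidates &= adj[w]              # keep only nodes also adjacent to w
    return clique


def violated_clique(adj, y, tol=1e-9):
    """Return a clique whose LP weight exceeds 1, or None if none is found."""
    for v in sorted(adj, key=lambda r: len(adj[r])):  # minimum degree first
        clique = greedy_clique(adj, y, v)
        if sum(y[r] for r in clique) > 1 + tol:
            return clique
    return None


# Hypothetical fractional LP solution on four single-visit columns.
COLUMNS = {"r1": {1, 2}, "r2": {2, 3}, "r3": {1, 3}, "r4": {4}}
Y = {"r1": 0.5, "r2": 0.5, "r3": 0.5, "r4": 1.0}
found = violated_clique(intersection_graph(COLUMNS), Y)
```

In this example the three pairwise-overlapping columns form a clique of weight 1.5, so the clique inequality cuts off the fractional point.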

19.3.2 Identifying Violated Odd Hole Constraints 

Hoffman and Padberg use the following procedure to identify violated odd hole
constraints. Suppose $y$ is the current optimal solution to the linear program and
$G$ is the corresponding intersection graph. Starting from an arbitrary node $v \in G$,
construct a layered graph $G_\ell(v)$ as follows. The node set of $G_\ell(v)$ is the
same as the node set of $G$. Every neighbor of $v$ in $G$ is connected to $v$ by an
edge in $G_\ell(v)$. We refer to $v$ as the root, or level-0 node, and we refer to the
neighbors of $v$ as level-1 nodes. Similarly, nodes at level $k \ge 2$ are those nodes
in $G$ that are connected (in $G$) to a level-$(k-1)$ node but are not connected
to any node at level $< k - 1$. Finally, each edge $(u_i, u_j)$ in $G_\ell(v)$ is assigned
a length of $1 - y_{u_i} - y_{u_j} \ge 0$. Now pick a node $u$ in $G_\ell(v)$ at level $k \ge 2$
and find the shortest path from $u$ to $v$ in $G_\ell(v)$. Delete all nodes at levels $i$
($1 \le i < k$) that are either on the shortest path or adjacent to nodes along
this shortest path (other than nodes that are adjacent to $v$). Now pick another
node $w$ that is adjacent (in $G$) to $u$ in level $k$. Find the shortest path from $w$
to $v$ in the current graph $G_\ell(v)$. Combining these two paths with the arc $(u, w)$
creates an odd hole. If the total length of this cycle is less than 1, then we have
found a violated odd hole inequality. If not, we continue with another neighbor
of $u$ and repeat the process. We can then choose a node different from $u$
at level $k$. If no violated odd hole inequality is found at level $k$, we proceed to
level $k + 1$. This subroutine can be repeated for different starting nodes ($v$) as
well.
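
The test "total length less than 1" is the odd hole inequality in disguise. The short derivation below is a sketch that assumes the odd hole inequality has the standard form $\sum_{r \in C} y_r \le (|C| - 1)/2$ and uses the edge lengths $1 - y_u - y_v$ defined above.

```latex
\begin{align*}
\sum_{(u,v) \in C} \bigl(1 - y_u - y_v\bigr)
  &= |C| - 2 \sum_{r \in C} y_r
  && \text{(each node of $C$ lies on exactly two edges of $C$)} \\
|C| - 2 \sum_{r \in C} y_r < 1
  &\iff \sum_{r \in C} y_r > \frac{|C| - 1}{2}.
\end{align*}
```

So a cycle of total length below 1 is exactly one whose columns carry more weight than the inequality allows, which is why shortest paths in the layered graph are a natural search tool.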



19.4 The Effectiveness of the Set-Partitioning 
Formulation 

The effectiveness of this algorithm depends crucially on the quality of the initial
lower bound; this lower bound is the optimal solution to the linear relaxation of
Problem $S_M$. If this lower bound is not very tight, then the branch-and-bound or
the constraint-generation method will most likely not be computationally effective.
On the other hand, when the gap between the lower bound and the best integer
solution is small, the procedure will probably be effective.

Fortunately, many researchers have reported that the linear relaxation of the
set-partitioning problem, Problem $S_M$, provides a solution close to the optimal
integer solution [see, e.g., Desrochers et al. (1992)]. That is, the solution to the linear
relaxation of Problem $S_M$ provides a very tight lower bound on the solution of
the VRP. For instance, in their paper, Desrochers et al. report an average relative
gap between the optimal solution to the linear relaxation and the optimal integer
solution of only 0.733%. A possible explanation for this observation is embodied
in the following theorem, which states that asymptotically the relative error
between the optimal solution to the linear relaxation of the set-partitioning model
and the optimal integer solution goes to zero as the number of customers increases.
Consider again the general VRP with capacity and time-window constraints.

Theorem 19.4.1 Let the customer locations $x_1, x_2, \ldots, x_n$ be a sequence of
independent random variables having a distribution $\mu$ with compact support in $\mathbb{R}^2$.
Let the customer parameters (see Chap. 18) be independently and identically
distributed like $\Phi$. Let $Z^{LP}$ be the value of the optimal fractional solution to $S$, and
let $Z^*$ be the value of the optimal integer solution to $S$, that is, the value of the
optimal solution to the VRP. Then

$$\lim_{n \to \infty} \frac{1}{n} Z^{LP} = \lim_{n \to \infty} \frac{1}{n} Z^* \quad \text{(a.s.)}.$$

The theorem thus implies that the optimal solution value of the linear program- 
ming relaxation of Problem S tends to the optimal solution of the vehicle routing 
problem as the number of customers tends to infinity. This is important since, as 
shown by Bramel and Simchi-Levi (1997), other classical formulations of the VRP 
can lead to diverging linear and integer solution values (see Exercise 19.8). 

In the next section, we motivate Theorem 19.4.1 by presenting a simplified model
that captures the essential ideas of the proof. Finally, in Sect. 19.4.2, we provide
a formal proof of the theorem. Again, to simplify the presentation, we assume no
time-window constraints exist; for the general case, the interested reader is referred
to Bramel and Simchi-Levi (1997).

19.4.1 Motivation 

Define a customer type to be a location $x \in \mathbb{R}^2$ and a customer demand $w$; that
is, a customer type defines the customer location and a value for the customer
demand. Consider a discretized vehicle routing model in which there are a finite
number $W$ of customer types and a finite number $m$ of distinct customer locations.
Let $n_i$ be the number of customers of type $i$, for $i = 1, 2, \ldots, W$, and let
$n = \sum_{i=1}^{W} n_i$ be the total number of customers. Clearly, this discretized vehicle routing
problem can be solved by formulating it as a set-partitioning problem. To obtain
some intuition about the linear relaxation of $S$, we introduce another formulation
of the vehicle routing problem closely related to $S$.

Let a vehicle assignment be a vector $(a_1, a_2, \ldots, a_W)$, where $a_i \ge 0$ are integers,
and such that a single vehicle can feasibly serve $a_1$ customers of type 1, and $a_2$
customers of type 2, ..., and $a_W$ customers of type $W$ together without violating the
vehicle capacity constraint. Index all the possible vehicle assignments $1, 2, \ldots, R_a$
and let $c_r$ be the total length of the shortest feasible route serving the customers
in vehicle assignment $r$. (Note that $R_a$ is independent of $n$.) The vehicle routing
problem can be formulated as follows. Let

$$A_{ir} = \text{number of customers of type } i \text{ in vehicle assignment } r,$$

for each $i = 1, 2, \ldots, W$ and $r = 1, 2, \ldots, R_a$. Let

$$y_r = \text{number of times vehicle assignment } r \text{ is used in the optimal solution.}$$

The new formulation of this discretized VRP is

$$\begin{aligned}
\text{Problem } S_N: \quad \min \ & \sum_{r=1}^{R_a} y_r c_r \\
\text{s.t.} \ & \sum_{r=1}^{R_a} y_r A_{ir} \ge n_i, \quad \forall i = 1, 2, \ldots, W, \\
& y_r \ge 0 \text{ and integer}, \quad \forall r = 1, 2, \ldots, R_a.
\end{aligned}$$

Let $Z_N^*$ be the value of the optimal solution to Problem $S_N$ and let $Z_N^{LP}$ be the
optimal solution to the linear relaxation of Problem $S_N$. Clearly, Problem $S$ and
Problem $S_N$ have the same optimal solution values, that is, $Z^* = Z_N^*$, while their
linear relaxations may be different. Define $c = \max_{r = 1, 2, \ldots, R_a} \{c_r\}$; that is, $c$ is the
length of the longest route among the $R_a$ vehicle assignments. Using an analysis
identical to the one in Sect. 6.2, we obtain
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Lemma 19.4.2 

$$Z^{LP} \le Z^* \le Z_N^{LP} + Wc \le Z^{LP} + Wc.$$

Observe that the upper bound on $Z^*$ obtained in Lemma 19.4.2 consists of two
terms. The first, $Z^{LP}$, is a lower bound on $Z^*$, which clearly grows with the number
of customers $n$. The second term ($Wc$) is the product of two numbers that are
fixed and independent of $n$. Therefore, the upper bound on $Z^*$ of Lemma 19.4.2 is
dominated by $Z^{LP}$, and consequently, we see that for large $n$, $Z^* \approx Z^{LP}$, exactly
what is implied by Theorem 19.4.1. Indeed, much of the proof of the following
section is concerned with approximating the distributions $\mu$ (customer locations)
and $\Phi$ (customer demands) with discrete distributions and forcing the number of
different customer types to be independent of $n$.
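
One way to see where the additive term $Wc$ comes from is the usual rounding argument, sketched here under stated assumptions; it is not necessarily the argument of Sect. 6.2. Write $Z_N^*$ and $Z_N^{LP}$ for the optimal values of the vehicle-assignment formulation and of its linear relaxation. That relaxation has only $W$ covering constraints, so it has an optimal basic solution $\hat{y}$ (a symbol introduced only for this sketch) with at most $W$ positive components. Rounding every component up keeps all covering constraints satisfied, because the coefficients $A_{ir}$ are nonnegative, and increases the cost by less than $c_r$ for each positive component:

```latex
\begin{align*}
Z_N^{*} \;\le\; \sum_{r=1}^{R_a} \lceil \hat{y}_r \rceil \, c_r
  \;\le\; \sum_{r=1}^{R_a} \hat{y}_r \, c_r \;+ \sum_{r \,:\, \hat{y}_r > 0} c_r
  \;\le\; Z_N^{LP} + W c .
\end{align*}
```

The key point is that the number of rounded variables is bounded by the number of customer types, not by the number of customers.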

19.4.2 Proof of Theorem 19.4.1

It is clear that $Z^{LP} \le Z^*$; therefore, $\liminf_{n \to \infty} \frac{1}{n}(Z^* - Z^{LP}) \ge 0$. The interesting
part is to find an upper bound on $Z^*$ that involves $Z^{LP}$ and use this upper bound
to show that $\limsup_{n \to \infty} \frac{1}{n}(Z^* - Z^{LP}) \le 0$. We do this in essentially the same way as
in Sect. 19.4.1. We successively discretize the problem by introducing a sequence
of vehicle routing problems whose optimal solutions are “relatively” close to $Z^*$.
The last vehicle routing problem is a discrete problem, which, therefore, as in
Sect. 19.4.1, can be directly related to the linear relaxation of its set-partitioning
formulation. This linear program is also shown to have an optimal solution close
to $Z^{LP}$.

To prove the upper bound, let $N$ be the index set of customers, with $|N| = n$,
and let Problem $P$ be the original VRP. Let $A$ be the compact support of the
distribution of the customer locations ($\mu$), and define $d_{\max} = \sup\{\|x\| : x \in A\}$,
where $\|x\|$ is the distance from point $x \in A$ to the depot. Finally, pick a fixed
$k > 1$.

Discretization of the Locations 

We start by constructing the following vehicle routing problem with discrete
locations. Define $\Delta = \frac{1}{k}$ and let $G(\Delta)$ be an infinite grid of squares of diagonal
$\Delta$, that is, of side $\frac{\Delta}{\sqrt{2}}$, with edges parallel to the system coordinates. Let
$A_1, A_2, \ldots, A_{m(\Delta)}$ be the subregions of $G(\Delta)$ that intersect $A$ and have $\mu(A_i) > 0$.
Since $A$ is bounded, $m(\Delta)$ is finite for each $\Delta > 0$. For convenience, we omit the
dependence of $m$ on $\Delta$ in the notation. For each subregion, let $X_i$ be the centroid
of subregion $A_i$, that is, the point at the center of the grid square containing $A_i$.
This defines $m$ points $X_1, X_2, \ldots, X_m$; note that a customer is at most $\frac{\Delta}{2}$ units
from the centroid of the subregion in which it is located.

Construct a new VRP, called $P(m)$, defined on the customers of $N$. Each of the
customers in $N$ is moved to the centroid of the subregion in which it is located.
Let $Z^*(m)$ be the optimal solution to $P(m)$. We clearly have

$$Z^* \le Z^*(m) + n\Delta. \qquad (19.10)$$
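
The location discretization is easy to experiment with. The sketch below is an illustration only: it snaps each customer to the center of its grid square and checks, on made-up points, that no customer moves more than half the diagonal, so that any fixed visiting order changes in length by at most $\Delta$ per customer. The function names and the sample points are hypothetical, and the grid is assumed to be aligned with the origin.

```python
from math import dist, floor, sqrt


def snap_to_centroid(point, delta):
    """Center of the axis-parallel grid square (diagonal `delta`) containing `point`."""
    side = delta / sqrt(2)
    return tuple((floor(c / side) + 0.5) * side for c in point)


def tour_length(stops):
    """Length of the closed tour stops[0] -> stops[1] -> ... -> stops[0]."""
    return sum(dist(a, b) for a, b in zip(stops, stops[1:] + stops[:1]))


k = 4
delta = 1 / k
depot = (0.0, 0.0)
customers = [(0.31, 0.77), (-0.52, 0.14), (0.9, -0.33), (0.05, 0.05)]
snapped = [snap_to_centroid(p, delta) for p in customers]

# Each customer is within delta / 2 of its centroid ...
assert all(dist(p, s) <= delta / 2 + 1e-12 for p, s in zip(customers, snapped))
# ... so visiting the true locations costs at most delta more per customer.
assert tour_length([depot] + customers) <= (
    tour_length([depot] + snapped) + len(customers) * delta + 1e-12
)
```

Summing the per-customer detour over all routes of an optimal solution to the discretized problem gives the additive term in (19.10).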
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Discretization of the Customer Demands 

We now describe a VRP where the customer demands are also discretized in
much the same way as it is done in Sect. 6.2. Partition the interval $(0, 1]$ into
subintervals of size $\Delta$ $(= \frac{1}{k})$. This produces $k$ segments and $\ell = k - 1$ points in the
interval $(0, 1)$, which we call corners.

We refer to each centroid-corner pair as a customer type; each centroid defines
a customer location and each corner defines the customer demand. It is clear that
there are $m\ell$ possible customer types. An instance of a fully discretized vehicle
routing problem is then defined by specifying the number of customers of each of
the $m\ell$ types.

For each centroid $j = 1, 2, \ldots, m$ and corner $i = 1, 2, \ldots, \ell$, let

$$N_{ji} = \left\{ h \in N : \frac{i-1}{k} < w_h \le \frac{i}{k} \text{ and } x_h \in A_j \right\}.$$

Finally, for every $j = 1, 2, \ldots, m$ and $i = 1, 2, \ldots, \ell$, let $n_{ji} = |N_{ji}|$.

We now define a fully discretized vehicle routing problem $P_k(m)$, whose optimal
solution value is denoted $Z_k^*(m)$. The vehicle routing problem $P_k(m)$ is defined
as having $\min\{n_{ji}, n_{j,i+1}\}$ customers located at centroid $j$ with customer demand
equal to $\frac{i}{k}$, for each $i = 1, 2, \ldots, \ell$ and $j = 1, 2, \ldots, m$.

We have the following result.

We have the following result. 

Lemma 19.4.3

$$Z^*(m) \le Z_k^*(m) + 2 d_{\max} \sum_{j=1}^{m} \sum_{i=1}^{\ell} |n_{ji} - n_{j,i+1}|.$$

Proof. Observe:

(i) In $P_k(m)$, the number of customers at centroid $j$ and with demand defined by
corner $i$ is $\min\{n_{ji}, n_{j,i+1}\}$.

(ii) In $P(m)$, each customer belongs to exactly one of the subsets $N_{ji}$, for $j =
1, 2, \ldots, m$ and $i = 1, 2, \ldots, \ell$.

(iii) In $P(m)$, the customers in $N_{ji}$ have smaller loads than the customers of
$P_k(m)$ at centroid $j$ with demand defined by corner $i$.

Given an optimal solution to $P_k(m)$, let us construct a solution to $P(m)$. For
each centroid $j = 1, 2, \ldots, m$ and corner $i = 1, 2, \ldots, \ell$, we pick any
$\max\{n_{ji} - n_{j,i+1}, 0\}$ customers from $N_{ji}$ and serve them in individual vehicles. The remaining
$\min\{n_{ji}, n_{j,i+1}\}$ customers in $N_{ji}$ can be served with exactly the same vehicle
schedules as in $P_k(m)$. This can be done due to (iii); therefore, one can always serve
customers with a demand of $P(m)$ in the same vehicles in which the customers of
$P_k(m)$ are served. ∎
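
The bookkeeping in this construction can be illustrated for a single centroid. The sketch below is hypothetical: it classifies exact-fraction demands by the corner interval that contains them and, for each corner $i = 1, \ldots, k - 1$, reports how many customers keep a counterpart in the discretized problem, $\min\{n_i, n_{i+1}\}$, and how many are served in individual vehicles, $\max\{n_i - n_{i+1}, 0\}$. The sample demands are invented and chosen so that none falls in the top interval.

```python
from collections import Counter
from fractions import Fraction
from math import ceil


def demand_class(w, k):
    """Index i with (i - 1)/k < w <= i/k, for a demand w in (0, 1]."""
    return ceil(w * k)


def discretize(demands, k):
    """Per corner i = 1..k-1: (customers matched in P_k, customers served alone)."""
    n = Counter(demand_class(w, k) for w in demands)  # missing classes count as 0
    return {
        i: (min(n[i], n[i + 1]), max(n[i] - n[i + 1], 0))
        for i in range(1, k)
    }


# Hypothetical demands at one centroid, as exact fractions to avoid rounding issues.
demands = [Fraction(1, 8), Fraction(1, 4), Fraction(3, 8),
           Fraction(3, 8), Fraction(1, 2), Fraction(5, 8)]
plan = discretize(demands, k=4)
```

Each customer served alone costs at most $2 d_{\max}$, which is where the factor in Lemma 19.4.3 comes from.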

Now $P_k(m)$ is fully discrete, and we can apply results as in Sect. 19.4.1. Let
$Z_k^{LP}(m)$ be the optimal solution to the linear relaxation of the set-partitioning
formulation of the routing problem $P_k(m)$. Let $c$ be defined as in Sect. 19.4.1; that
is, it is the cost of the most expensive tour among all the possible routes in $P_k(m)$.






Lemma 19.4.4

$$Z_k^*(m) \le Z_k^{LP}(m) + m\ell c.$$

Proof. Since the number of customer types is at most $m\ell$, we can formulate $P_k(m)$
as the integer program, like Problem $S_N$, described in Sect. 19.4.1, with $m\ell$
constraints. The bound then follows from Lemma 19.4.2. ∎

Recall that $Z^{LP}$ is the optimal solution to the linear relaxation of the
set-partitioning formulation of the VRP defined by Problem $P$. Then

Lemma 19.4.5

$$Z_k^{LP}(m) \le Z^{LP} + n\Delta.$$

Proof. Let $\{y_r : r = 1, 2, \ldots, R\}$ be the optimal solution to the linear relaxation of
the set-partitioning formulation of Problem $P$. We can assume (see Exercise 19.3)
that $\sum_{r=1}^{R} y_r \alpha_{ir} = 1$, for each $i = 1, 2, \ldots, n$. We construct a feasible solution to
the linear relaxation of the set-partitioning formulation of $P_k(m)$ using the values
$y_r$. Since every customer in $P_k(m)$ assigned to centroid $j$ and corner $i$ can be
associated with a customer in $P$ with $x_h \in A_j$ and whose demand is at least as
large, each route $r$ with $y_r > 0$ can be used to construct a route $r'$ feasible for
$P_k(m)$. Since in $P_k(m)$ the customers are at the centroids instead of at their original
locations, we modify the route so that the vehicle travels from the customer to its
centroid and back. Thus, the length (cost) of route $r'$ is at most the cost of route
$r$ in $P$ plus $n_r \Delta$, where $n_r$ is the number of customers in route $r$.

To create a feasible solution to the linear relaxation of the set-partitioning
formulation to $P_k(m)$, we take the solution to the linear relaxation of $P$ and create
the routes $r'$ as above. Therefore,

$$Z_k^{LP}(m) \le Z^{LP} + \sum_{r=1}^{R} y_r n_r \Delta \le Z^{LP} + n\Delta.$$

∎



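We can now prove Theorem 19.4.1.

$$\begin{aligned}
Z^* &\le Z^*(m) + n\Delta \\
&\le Z_k^*(m) + 2 d_{\max} \sum_{j=1}^{m} \sum_{i=1}^{\ell} |n_{ji} - n_{j,i+1}| + n\Delta \\
&\le Z_k^{LP}(m) + m\ell c + 2 d_{\max} \sum_{j=1}^{m} \sum_{i=1}^{\ell} |n_{ji} - n_{j,i+1}| + n\Delta \\
&\le Z^{LP} + m\ell c + 2 d_{\max} \sum_{j=1}^{m} \sum_{i=1}^{\ell} |n_{ji} - n_{j,i+1}| + 2n\Delta.
\end{aligned}$$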
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We now need to show that Z LP is the dominant part of the last upper bound. 
We do that using the following lemma. 

Lemma 19.4.6 There exists a constant $K$ such that

$$\limsup_{n \to \infty} \frac{1}{n} \sum_{j=1}^{m} \sum_{i=1}^{\ell} |n_{ji} - n_{j,i+1}| \le \frac{2K}{k}.$$


Proof. In Sect. 6.2, we prove that given $i$ and $j$, there exists a constant $K$ such
that

$$\limsup_{n \to \infty} \frac{1}{n} |n_{ji} - n_{j,i+1}| \le \frac{2K}{k^2}.$$

Therefore, a similar analysis gives

$$\limsup_{n \to \infty} \frac{1}{n} \sum_{j=1}^{m} \sum_{i=1}^{\ell} |n_{ji} - n_{j,i+1}| \le \frac{2K}{k}.$$

Finally, observe that each tour in $P_k(m)$ has a total length no more than 1 since
the truck travels at a unit speed and the length of each working day is 1. Hence,
$m\ell c = O(1)$, and therefore,

$$\limsup_{n \to \infty} \frac{1}{n}(Z^* - Z^{LP}) \le \frac{4K d_{\max}}{k} + 2\Delta = \frac{2}{k}(2K d_{\max} + 1).$$

Since $K$ is a constant and $k$ was arbitrary, we see that the right-hand side can be
made arbitrarily small. Therefore,

$$0 \le \liminf_{n \to \infty} \frac{1}{n}(Z^* - Z^{LP}) \le \limsup_{n \to \infty} \frac{1}{n}(Z^* - Z^{LP}) \le 0.$$

We conclude this chapter with the following observation. The proof of Theorem
19.4.1 also reveals an upper bound on the rate of convergence of $Z^{LP}$ to its
asymptotic value. Indeed (see Exercise 19.1), we have

$$E(Z^*) \le E(Z^{LP}) + O(n^{3/4}). \qquad (19.11)$$



19.5 Exercises 



Exercise 19.1. Prove the upper bound on the convergence rate (19.11). 

Exercise 19.2. Consider an undirected graph $G = (V, E)$, where each edge $(i, j)$
has a cost $c_{ij}$ and each vertex $i \in V$ a nonnegative penalty $\pi_i$. In the
prize-collecting traveling salesman problem (PCTSP), the objective is to find a tour
that visits a subset of the vertices such that the length of the tour plus the sum of
the penalties of all vertices not in the tour is as small as possible. Show that the
problem can be formulated as a longest-path problem between two prespecified
nodes of a new network.

Exercise 19.3. Consider the bin-packing problem. Let Wi be the size of item i, 

i = 1, ,n, and assume the bin capacity is 1. An important formulation of the 

bin-packing problem is as a set-covering problem. Let 

F = {S:^i< !}• 

ies 



Define 

f 1, if item i is in S', 

QLiS = % 

( 0, otherwise, 

for each i = 1, 2, . . . , n and each S E F. Finally, for any S, S G F, let 



f 1 , if the items in S are packed in a single bin with no other items, 
\ 0, otherwise. 



In the set-covering formulation of the bin-packing problem, the objective is to 
select a minimum number of feasible bins such that each item is included in some 
bin. It is the following integer program. 



Problem P 



Min 

s.t. 



E 

seF 



ys 



^2ysUis > 1, Vi = 1,2, 

SeF 

ys e {0,1}, V5eF. 



(19.12) 



Let Z* be the optimal solution to Problem P, and let Z LP be the optimal solution 
to the linear relaxation of Problem P. We want to prove that 



Z* < 2 Z LP . 



(19.13) 



(a) Formulate the dual of the linear relaxation of Problem P. 

( b ) Show that Yll Li w i ^ Z LP . 

( c ) Argue that Z* < 2 Conclude that (19.13) holds. 

(d) An alternative formulation to Problem P is obtained by replacing constraints 
(19.12) with equality constraints. Call the new problem Problem PE. Show 
that the optimal solution value of the linear relaxation of Problem P equals 
the optimal solution value of the linear relaxation of Problem PE. 
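
To make the set-covering formulation of Exercise 19.3 concrete, the following Python sketch enumerates the family $F$ for a tiny instance and finds $Z^*$ by brute force. The item sizes and the function names are hypothetical. Exhaustive enumeration is used only for illustration: it is exponential in $n$ and is not a method proposed in the text.

```python
from fractions import Fraction
from itertools import combinations

def feasible_sets(w):
    """Return F: all nonempty item subsets whose total size is at most 1."""
    n = len(w)
    return [S for k in range(1, n + 1)
            for S in combinations(range(n), k)
            if sum(w[i] for i in S) <= 1]

def min_cover(w):
    """Brute-force Z*: fewest sets of F whose union contains every item."""
    F = feasible_sets(w)
    items = set(range(len(w)))
    for k in range(1, len(w) + 1):      # n singleton bins suffice if every w_i <= 1
        for bins in combinations(F, k):
            if set().union(*bins) == items:
                return k, bins
    raise ValueError("some item is larger than the bin capacity")

# Hypothetical sizes, kept as exact fractions to avoid rounding issues.
w = [Fraction(6, 10), Fraction(5, 10), Fraction(4, 10), Fraction(3, 10)]
z_star, bins = min_cover(w)
print(len(feasible_sets(w)), z_star, float(sum(w)))
```

For instances of realistic size, $F$ is far too large to list explicitly, which is why column generation is used in this chapter.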

Exercise 19.4. Recall the dynamic program given by (19.6). Let




Consider the function defined as follows:

\[ g_q(i) = \min_{w_i \le q' \le q} \big\{ f_{q'}(i) + f_{q - q' + w_i}(i) \big\}, \]

for each $i \in N$ and $w_i \le q \le Q$. Now define
$g = \min_{i \in N} \min_{w_i \le q \le Q} g_q(i)$. Show that $f = g$.

Exercise 19.5. Develop a dynamic programming procedure for the column-generation
step similar to $f_q(i)$ that avoids two-loops (loops of the type $\ldots, i, j, i, \ldots$).
What is the complexity of this procedure?

Exercise 19.6. Develop a dynamic programming procedure for the column-generation
step in the presence of time-window constraints. What is required of
the time-window data in order for this to be possible? What is the complexity of
your procedure?

Exercise 19.7. Develop a dynamic programming procedure for the column- 
generation step in the presence of a distance constraint on the length of any route. 
What is required of the distance data in order for this to be possible? What is the 
complexity of your procedure? 

Exercise 19.8. Consider an instance of the VRPTW with $n$ customers. Given a
subset of the customers $S$, let $b^*(S)$ be the minimum number of vehicles required
to carry the demands of customers in $S$; that is, $b^*(S)$ is the solution to the
bin-packing problem defined on the demands of all customers in $S$. For
$i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n$, let

\[ x_{ij} = \begin{cases} 1, & \text{if a vehicle travels directly between points $i$ and $j$,} \\ 0, & \text{otherwise.} \end{cases} \]

Let 0 denote the depot and define $c_{ij}$ as the cost of traveling directly between
points $i$ and $j$, for $i, j = 0, 1, 2, \ldots, n$. Let $t_i$ represent the time a vehicle
arrives at the location of customer $i$; also, for every $i$ and $j$ such that $i < j$, define
$M_{ij} = \max\{l_i + d_{ij} - e_j, 0\}$, where $d_{ij} = \|Y_i - Y_j\|$. Then the following is a
valid formulation of the VRPTW:

Problem P': Min $\sum c_{ij} x_{ij}$




i<j 



i>j 





ijes 
\[ \begin{aligned}
& e_i \le t_i \le l_i - s_i, && 1 \le i \le n, \\
& t_i + s_i + d_{ij} - t_j \le M_{ij}(1 - x_{ij}), && 1 \le i < j \le n, \\
& x_{ij} \in \{0, 1\}, && 1 \le i < j \le n, && (19.14) \\
& x_{0j} \in \{0, 1, 2\}, && j = 1, 2, \ldots, n. && (19.15)
\end{aligned} \]



The case $x_{0j} = 2$ corresponds to a vehicle serving only customer $j$. The linear
programming relaxation of P' is obtained by replacing constraints (19.14) and 
(19.15) by their linear equivalents. 

Construct an instance of the VRPTW in which the fractional and integer solu- 
tions to the above linear program do not approach the same value asymptotically. 

Network Planning



20.1 Introduction 

In this chapter, we present some of the issues involved in the practice of supply 
chain design and planning. These are issues that are often not dealt with in tra- 
ditional operations research analyses. However, they are essential in transforming 
raw data and problem characteristics into modeling assumptions, input data, and 
decisions. 

Our focus is on what we call network planning — the process by which the firm 
structures and manages the supply chain in order to 

• find the right balance among inventory, transportation, and manufacturing 
costs, 

• match supply and demand under uncertainty by positioning inventory effec- 
tively, 

• utilize resources effectively in a dynamic environment. 

Of course, this is a complex process, which requires a hierarchical approach in 
which decisions on network design, inventory positioning and management, as well 
as resource utilization are combined to reduce cost and increase service level. Thus, 
we divide the network planning process into three steps: 

1. Network design: This includes decisions on the number, locations, and size 
of manufacturing plants and warehouses, the assignment of retail outlets to 
warehouses, and so on. Major sourcing decisions are also made at this point,
and the typical planning horizon is a few years. 

2. Inventory positioning: This includes identifying stocking points as well as 
selecting facilities that will produce to stock, and thus keep inventory, and 
facilities that will produce to order and hence keep no inventory. 

3. Resource allocation: Given the structure of the logistics network and the 
location of stocking points, the objective in this step is to determine when 
and how much to produce or purchase and where and when to store inven- 
tory. These decisions require identifying the optimal tradeoff between setup 
costs and times, and inventory and transportation costs, taking into account 
production, sourcing, and warehousing capacities as well as other business 
rules and constraints. 

In this chapter, we analyze each of these steps and provide examples of the 
processes involved. 



20.2 Network Design 

Network design determines the physical configuration and infrastructure of the 
supply chain. As explained in Chap. 1, network design is a strategic decision that 
has a long-lasting effect on the firm. Network design involves decisions related to 
plant and warehouse location as well as distribution and sourcing. 

The supply chain infrastructure typically needs to be reevaluated due to changes 
in demand patterns, product mix, production processes, sourcing strategies, or the 
cost of running facilities. In addition, mergers and acquisitions may mandate the 
integration of different logistics networks. 

In the discussion below, we concentrate on the following key strategic decisions: 

1. determining the appropriate number of facilities such as plants and ware- 
houses; 

2. determining the location of each facility; 

3. determining the size of each facility; 

4. allocating space for products in each facility; 

5. determining the production requirements in each plant; 

6. determining sourcing requirements; 

7. determining distribution. 

The objective is to design or reconfigure the logistics network in order to 
minimize annual system-wide cost, including production and purchasing costs,
inventory holding costs, facility costs (storage, handling, and fixed costs), and
transportation costs, subject to a variety of service-level requirements. In this set- 
ting, the tradeoffs are clear. Increasing the number of warehouses typically yields 

• an improvement in the service level due to the reduction in the average travel 
time to the customers, 

• an increase in inventory costs due to increased safety stocks required to 
protect each warehouse against uncertainties in customer demands, 

• an increase in overhead and setup costs, 

• a reduction in outbound transportation costs: transportation costs from the 
warehouses to the customers, 

• an increase in inbound transportation costs: transportation costs from the 
suppliers and/or manufacturers to the warehouses. 

In essence, the firm must balance the costs of opening new warehouses with 
the advantages of being close to the customer. Thus, warehouse-location decisions 
are crucial determinants of whether the supply chain is an efficient channel for 
the distribution of products. 

We describe below some of the issues related to data collection and the cal- 
culation of costs required for the optimization models. Some of the information 
provided is based on logistics textbooks such as Ballou (1992), Johnson and Wood 
(1986), and Robeson and Copacino (1994). 

Figures 20.1 and 20.2 present two screens of a typical advance planning system 
(APS); the user would see these screens at different stages of optimization. One 
screen represents the network prior to optimization, and the other represents the 
optimized network. 

Data Collection 

A typical network configuration problem involves large amounts of data, includ- 
ing information on 

1. the locations of customers, retailers, existing warehouses and distribution 
centers, manufacturing facilities, and suppliers, 

2. all products, including volumes, and special transport modes (e.g., refrigerated),

3. annual demand for each product by customer location, 

4. transportation rates by mode, 

5. warehousing costs, including labor, inventory carrying charges, and fixed 
operating costs, 

6. shipment sizes and frequencies for customer delivery, 



7. order processing costs,

8. customer service requirements and goals,

9. production and sourcing costs and capacities.

FIGURE 20.1. The APS screen representing data prior to optimization

FIGURE 20.2. The APS screen representing the optimized logistics network



Data Aggregation 

A quick look at the above list suggests that the amount of data involved in any 
optimization model for this problem is overwhelming. For instance, a typical soft 
drink distribution system has between 10,000 and 120,000 accounts (customers). 
Similarly, in a retail logistics network, such as Wal-Mart or JC Penney, the number 
of different products that flow through the network is in the thousands or even 
hundreds of thousands. 

For that reason, an essential first step is data aggregation. This is carried out 
using the following procedure: 

1. Customers located in close proximity to each other are aggregated using a 
grid network or other clustering technique. All customers within a single cell 
or a single cluster are replaced by a single customer located at the center 
of the cell or cluster. This cell or cluster is referred to as a customer zone. 
A very effective technique that is commonly used is to aggregate customers 
according to the five-digit or three-digit zip code. Observe that if customers 
are classified according to their service levels or frequency of delivery, they 
will be aggregated together by classes. That is, all customers within the same 
class are aggregated independently of the other classes. 

2. Items are aggregated into a reasonable number of product groups, based on 

(a) Distribution pattern. All products picked up at the same source and 
destined to the same customers are aggregated together. Sometimes 
there is a need to aggregate not only by distribution pattern but also 
by logistics characteristics, such as weight and volume. That is, consider 
all products having the same distribution pattern. Within these prod- 
ucts, we aggregate those SKUs with similar volume and weight into one 
product group. 

(b) Product type. In many cases, different products might simply be vari- 
ations in product models or style or might differ only in the type of 
packaging. These products are typically aggregated together. 

An important consideration, of course, is the impact on the model’s effectiveness 
of replacing the original detailed data with the aggregated data. We address this 
question in two ways. 

1. Even if the technology exists to solve the logistics network design problem
with the original data, it may still be useful to aggregate data because
our ability to forecast customer demand at the account and product levels
is usually poor. Because of the reduction in variability achieved through
aggregation, forecast demand is significantly more accurate at the aggregated
level.

2. Various researchers report that aggregating data into about 150-200 points
usually results in no more than a 1% error in the estimation of total
transportation costs; see Ballou (1992) and House and Jamie (1981).

FIGURE 20.3. The APS screen representing data prior to aggregation

In practice, the following approach is typically used when aggregating the data: 

• Aggregate demand points into 150-200 zones. If customers are classified into
classes according to their service levels or frequency of delivery, each class 
will have 150-200 aggregated points. 

• Make sure each zone has approximately an equal amount of total demand. 
This implies that the zones may be of different geographic sizes. 

• Place the aggregated points at the center of the zone. 

• Aggregate the products into 20-50 product groups. 

Figure 20.3 presents information about 3,220 customers all located in North 
America, while Fig. 20.4 shows the same data after aggregation using a three-digit 
zip code, resulting in 217 aggregated points. 

FIGURE 20.4. The APS screen representing data after aggregation
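
The customer-aggregation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`zip`, `lat`, `lon`, `demand`) and the sample records are hypothetical, and each zone is placed at the unweighted mean of its customers' coordinates, which is one possible reading of "the center of the cell or cluster."

```python
from collections import defaultdict

def aggregate_by_zip3(customers):
    """Group customers into zones keyed by three-digit zip code prefix.

    Each customer is a dict with hypothetical keys 'zip', 'lat', 'lon',
    and 'demand'. A zone is located at the unweighted mean coordinate
    of its members and carries their total demand.
    """
    groups = defaultdict(list)
    for c in customers:
        groups[c["zip"][:3]].append(c)
    zones = {}
    for prefix, members in groups.items():
        zones[prefix] = {
            "lat": sum(m["lat"] for m in members) / len(members),
            "lon": sum(m["lon"] for m in members) / len(members),
            "demand": sum(m["demand"] for m in members),
            "customers": len(members),
        }
    return zones

# Hypothetical sample data: two Chicago-area accounts and one in Cambridge.
customers = [
    {"zip": "60601", "lat": 41.89, "lon": -87.62, "demand": 120},
    {"zip": "60614", "lat": 41.92, "lon": -87.65, "demand": 80},
    {"zip": "02139", "lat": 42.36, "lon": -71.10, "demand": 50},
]
zones = aggregate_by_zip3(customers)
```

If customers are classified by service level or delivery frequency, the same function could be applied to each class separately, so that classes are aggregated independently of one another.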

Transportation Rates 

The next step in constructing an effective distribution-network design model is to
estimate transportation costs. An important characteristic of most transportation 
rates, including truck, rail, and others, is that the rates are almost linear with 
distance but not with volume. We distinguish here between transportation costs 
associated with an internal and an external fleet. 

Estimating transportation costs for company-owned trucks is typically quite 
simple. It involves annual costs per truck, annual mileage per truck, annual amount 
delivered, and the truck’s effective capacity. All this information can be used to 
easily calculate the cost per mile per SKU. 

Incorporating transportation rates for an external fleet into the model is more 
complex. We distinguish here between two modes of transportation: truckload, 
referred to as TL, and less than truckload, referred to as LTL. 

In the United States, TL carriers subdivide the country into zones. Almost every 
state is a single zone, except for certain big states, such as Florida or New York, 
which are partitioned into two zones. The carriers then provide their clients with 
zone-to-zone cost tables. This database provides the cost per mile per truckload
between any two zones. For example, to calculate the TL cost from Chicago, 
Illinois, to Boston, Massachusetts, one needs to get the cost per mile for this pair 
and multiply it by the distance from Chicago to Boston. An important property 
of the TL cost structure is that it is not symmetric; that is, it is typically more 
expensive to ship a fully loaded truck from Illinois to New York than from New 
York to Illinois. 
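
The TL calculation just described amounts to a table lookup followed by a multiplication. The sketch below illustrates it in Python; the rates, the mileage, and the names `TL_RATE_PER_MILE` and `tl_cost` are hypothetical and are chosen only to show the asymmetry of the cost structure.

```python
# Hypothetical zone-to-zone rates in dollars per mile per truckload.
# The table is keyed by (origin zone, destination zone), so the two
# directions of a lane can carry different rates.
TL_RATE_PER_MILE = {
    ("IL", "MA"): 2.10,
    ("MA", "IL"): 1.60,
}

def tl_cost(origin_zone, dest_zone, miles, truckloads=1):
    """Truckload cost: rate per mile x distance x number of truckloads."""
    rate = TL_RATE_PER_MILE[(origin_zone, dest_zone)]
    return rate * miles * truckloads

# Assumed distance of 980 miles between Chicago and Boston.
chicago_to_boston = tl_cost("IL", "MA", miles=980)
boston_to_chicago = tl_cost("MA", "IL", miles=980)
```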






In the LTL industry, the rates typically belong to one of three basic types of 
freight rates: class, exception, and commodity. The class rates are standard rates 
that can be found for almost all products or commodities shipped. They are found 
with the help of a classification tariff that gives each shipment a rating or a class. 
For instance, the railroad classification includes 31 classes, ranging from 400 to 
13, that are obtained from the widely used Uniform Freight Classification. The 
National Motor Freight Classification, on the other hand, includes only 23 classes, 
ranging from 500 to 35. In all cases, the higher the rating or class, the greater the 
relative charge for transporting the commodity. There are many factors involved 
in determining a product’s specific class. These include product density, ease or 
difficulty of handling and transporting, and liability for damage. 

Once the rating is established, it is necessary to identify the rate basis number. 
This number is the approximate distance between the load’s origin and destination. 
With the commodity rating or class and the rate basis number, the specific rate 
per hundred pounds (hundredweight, or cwt) can be obtained from a carrier tariff
table (i.e., a freight rate table). 
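
The class-rate lookup can be sketched as follows. The tariff values, the distance brackets, and the function name `ltl_cost` are hypothetical. Real tariffs also vary the rate with shipment weight, which this illustration ignores.

```python
import bisect

# Hypothetical tariff: for each class, the rate in dollars per cwt for
# each rate-basis-number bracket. RATE_BASIS_BREAKS holds the upper
# bound of each bracket.
RATE_BASIS_BREAKS = [250, 500, 1000, 2000]
TARIFF = {
    100: [18.0, 24.0, 33.0, 47.0],
    150: [25.0, 34.0, 46.0, 66.0],
}

def ltl_cost(freight_class, rate_basis_number, weight_lb):
    """Look up the rate per cwt and scale it by the shipment weight."""
    k = bisect.bisect_left(RATE_BASIS_BREAKS, rate_basis_number)
    if k == len(RATE_BASIS_BREAKS):
        raise ValueError("rate basis number beyond the tariff table")
    rate_per_cwt = TARIFF[freight_class][k]
    return rate_per_cwt * weight_lb / 100.0

cost_100 = ltl_cost(100, rate_basis_number=850, weight_lb=4000)
cost_150 = ltl_cost(150, rate_basis_number=850, weight_lb=4000)
```

In this made-up table the higher class carries the higher charge for the same shipment, in line with the rule stated above.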

The two other freight rates, namely, exception and commodity, are specialized 
rates used to provide either less expensive rates (exception) or commodity-specific
rates (commodity). For an excellent discussion, see Johnson and Wood (1986) and 
Patton (1994). Most carriers provide a database file with all of their transportation 
rates; these databases are typically incorporated into advance planning systems. 

The proliferation of LTL carrier rates and the highly fragmented nature of 
the trucking industry have created the need for sophisticated rating engines. 
An example of such a rating engine that is widely used is SMC3’s RateWare (see 
www.smc3.com). This engine can work with various carrier tariff tables as well as 
SMC3’s CzarLite, one of the most widely used and accepted forms of nationwide 
LTL zip code-based rates. Unlike an individual carrier’s tariff, CzarLite offers a 
market-based price list derived from studies of LTL pricing on a regional, interre- 
gional, and national basis. This provides shippers with a fair pricing system and 
prevents any individual carrier’s operational and marketing bias from overtly
influencing the shipper’s choice. Consequently, CzarLite rates are often used as a base
for negotiating LTL contracts among shippers, carriers, and third-party logistics 
providers. 

In Fig. 20.5, we provide the LTL cost charged by one carrier for shipping 4,000 
pounds as a function of the distance from Chicago. The cost is given for two 
classes, class 100 and class 150. As you can see, in this case, the transportation 
cost function is not linear with distance. 

FIGURE 20.5. Transportation rates for shipping 4,000 lb

Warehouse Costs

Warehousing and distribution center costs include three main components:

1. Handling costs: These include labor and utility costs that are proportional
to annual flow through the warehouse.

2. Fixed costs: These capture all cost components that are not proportional
to the amount of material that flows through the warehouse. The fixed cost
typically grows with warehouse size (capacity), but as a stepwise function
of capacity; that is, the cost is constant within certain ranges of the
warehouse size.

3. Storage costs: These represent inventory holding costs, which are
proportional to average positive inventory levels.

Thus, estimating the warehouse handling costs is fairly easy, while estimating the 
other two cost values is quite difficult. To see this difference, suppose that during 
the entire year 1,000 units of product are required by a particular customer. These 
1,000 units are not required to flow through the warehouse at the same time, 
so the average inventory level will likely be significantly lower than 1,000 units. 
Thus, when constructing the data for the APS, we need to convert these annual 
flows into actual inventory amounts over time. Similarly, annual flow and average 
inventory associated with this product tell us nothing about how much space is 
needed for the product in the warehouse. This is true because the amount of space 
that the warehouse needs is proportional to peak inventory, not annual flow or 
average inventory. 

An effective way to overcome this difficulty is to utilize the inventory turnover 
ratio. This is defined as the annual sales divided by the average inventory level. 
Specifically, in our case, the inventory turnover ratio is the ratio of the total annual 
outflow from the warehouse to the average inventory level. Thus, the average 
inventory level is the total annual flow divided by the inventory turnover ratio. 
Multiplying the average inventory level by the inventory holding cost gives the 
annual storage costs. Finally, to calculate the fixed cost, we need to estimate the 
warehouse capacity. This is done in the next subsection. 
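
As a small illustration of this calculation, the Python sketch below converts an annual flow into an average inventory level and an annual storage cost. The holding cost of 2.50 per unit per year is a hypothetical figure, and the function name is ours.

```python
def annual_storage_cost(annual_flow, turnover_ratio, holding_cost_per_unit):
    """Return (average inventory, annual storage cost).

    average inventory = annual flow / inventory turnover ratio
    storage cost      = average inventory x holding cost per unit per year
    """
    average_inventory = annual_flow / turnover_ratio
    return average_inventory, average_inventory * holding_cost_per_unit

avg_inv, storage_cost = annual_storage_cost(
    annual_flow=1000, turnover_ratio=10.0, holding_cost_per_unit=2.5)
```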



Warehouse Capacities 

Another important input to the distribution-network design model is the actual 
warehouse capacity. It is not immediately obvious, however, how to estimate 
the actual space required, given the specific annual flow of material through the 
warehouse. Again, the inventory turnover ratio suggests an appropriate approach.
As before, the annual flow through a warehouse divided by the inventory turnover 
ratio allows us to calculate the average inventory level. Assuming a regular ship- 
ment and delivery schedule, such as that given in Fig. 7.1, it follows that the 
required storage space is approximately twice that amount. In practice, of course, 
every pallet stored in the warehouse requires an empty space to allow for access and 
handling; thus, considering this space as well as space for aisles, picking, sorting, 
and processing facilities, as well as automatic guided vehicles (AGVs), we typically 
multiply the required storage space by a factor (> 1). This factor depends on the 
specific application and allows us to assess the amount of space available in the 
warehouse more accurately. A typical factor used in practice is 3. This factor would 
be used in the following way. Consider a situation where the annual flow through
the warehouse is 1,000 units and the inventory turnover ratio is 10.0. This implies
that the average inventory level is about 100 units, so the warehouse must hold
about twice that amount, or 200 units. Hence, if each unit takes 10 ft² of floor
space, the required space for the products is 2,000 ft². With a factor of 3, the
total space required for the warehouse is about 6,000 ft².
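
The same arithmetic can be written as a short Python function, shown here as a sketch. The function and parameter names are ours; the default factor of 3 and the doubling of average inventory follow the discussion above and should be adjusted to the specific application.

```python
def warehouse_space(annual_flow, turnover_ratio, sq_ft_per_unit, space_factor=3.0):
    """Return (space for products, total warehouse space) in square feet."""
    average_inventory = annual_flow / turnover_ratio
    peak_inventory = 2 * average_inventory       # regular shipment schedule assumed
    product_space = peak_inventory * sq_ft_per_unit
    return product_space, space_factor * product_space

# The example from the text: 1,000 units per year, turnover 10, 10 sq ft per unit.
product_space, total_space = warehouse_space(1000, 10.0, 10)
```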

Potential Warehouse Locations 

It is also important to effectively identify potential locations for new warehouses. 
Typically, these locations must satisfy a variety of conditions: 

• geographical and infrastructure conditions; 

• natural resources and labor availability; 

• local industry and tax regulations; 

• public interest. 

As a result, only a limited number of locations would meet all the requirements. 
These are the potential location sites for the new facilities. 

Service-Level Requirements 

There are various ways to define service levels in this context. For example, 
we might specify a maximum distance between each customer and the warehouse 
serving it. This ensures that a warehouse will be able to serve its customers within 
a reasonable time. Sometimes we must recognize that for some customers, such as 
those in rural or isolated areas, it is harder to provide the same level of service that 
most other customers receive. In this case, it is often helpful to define the service 
level as the proportion of customers whose distance to their assigned warehouse 
is no more than a given distance. For instance, we might require that 95% of the
customers be situated within 200 miles of the warehouses serving them. 
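
A service-level requirement of this kind is easy to check for a given assignment of customers to warehouses. The sketch below does so in Python under two assumptions that are ours, not the text's: distance is measured as great-circle distance, and each customer counts equally (rather than being weighted by demand). The coordinates are approximate and serve only as sample data.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (Earth radius taken as 3,958.8 mi)."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi, dlmb = p2 - p1, radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(a))

def service_level(assignments, max_miles=200.0):
    """Fraction of customers within max_miles of their assigned warehouse.

    assignments: list of ((cust_lat, cust_lon), (wh_lat, wh_lon)) pairs.
    """
    within = sum(1 for c, w in assignments
                 if haversine_miles(*c, *w) <= max_miles)
    return within / len(assignments)

chicago = (41.88, -87.63)                # hypothetical warehouse location
assignments = [
    ((43.04, -87.91), chicago),          # Milwaukee, roughly 80 miles away
    ((39.77, -86.16), chicago),          # Indianapolis, roughly 165 miles away
    ((44.98, -93.27), chicago),          # Minneapolis, roughly 355 miles away
]
level = service_level(assignments)
meets_target = level >= 0.95
```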

Future Demand 

As observed in Chap. 1, decisions at the strategic level, which include
distribution-network design, have a long-lasting effect on the firm. In particular,
decisions regarding the number, location, and size of warehouses have an impact
on the firm for at least the next three to five years. This implies that changes 
in customer demand over the next few years should be taken into account when 
designing the network. This is most commonly addressed using a scenario-based 
approach incorporating net present value calculations. For example, various pos- 
sible scenarios representing a variety of possible future demand patterns over the 
planning horizon can be generated. These scenarios can then be directly incorpo- 
rated into the model to determine the best distribution strategy. 
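
To illustrate the idea, the following Python sketch compares two candidate designs by the probability-weighted net present value of their yearly costs over a three-year horizon. Everything in it is hypothetical: the scenario probabilities, the cost figures, the 8% discount rate, and the design names. It also evaluates fixed designs rather than optimizing over all designs, which is what a full model would do.

```python
def npv(cash_flows, discount_rate):
    """Net present value of yearly costs; the year-1 cost is discounted once."""
    return sum(cf / (1 + discount_rate) ** t
               for t, cf in enumerate(cash_flows, start=1))

def expected_npv_cost(design_costs, probabilities, discount_rate):
    """Probability-weighted NPV of a design's yearly costs across scenarios."""
    return sum(p * npv(design_costs[s], discount_rate)
               for s, p in probabilities.items())

# Hypothetical demand scenarios and their assumed probabilities.
probabilities = {"low": 0.3, "base": 0.5, "high": 0.2}

# Hypothetical yearly total cost (in $ millions) of each design per scenario.
designs = {
    "4 warehouses": {"low": [10, 10, 10], "base": [12, 13, 14], "high": [15, 17, 19]},
    "6 warehouses": {"low": [11, 11, 11], "base": [12, 12, 13], "high": [13, 14, 15]},
}

scores = {name: expected_npv_cost(costs, probabilities, 0.08)
          for name, costs in designs.items()}
best = min(scores, key=scores.get)
```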

Model and Data Validation 

The previous subsections document the difficulties in collecting, tabulating, and 
cleaning the data for a network-configuration model. Once this is done, how do we 
ensure that the data and model accurately reflect the network-design problem? 

The process used to address this issue is known as model and data validation. 
This is typically done by reconstructing the existing network configuration using 
the model and collected data, and comparing the output of the model to existing 
data. 

The importance of validation cannot be overstated. Valuable output of the
model configured to duplicate current operating conditions includes all costs
(warehousing, inventory, production, and transportation) generated under the
current network configuration. These data can be compared to the company’s
accounting information. This is often the best way to identify errors in the data,
problematic assumptions, modeling flaws, and so forth.

In one project we are aware of, for example, the transportation costs calculated 
during the validation process were consistently underestimating the costs suggested 
by the accounting data. After a careful review of the distribution practices, the 
consultants concluded that the effective truck capacity was only about 30% of the
truck’s physical capacity; that is, trucks were being sent out with very little load. 
Thus, the validation process not only helped calibrate some of the parameters used 
in the model but also suggested potential improvements in the utilization of the 
existing network. 
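
A comparison of this kind can be automated once the model has been configured to reproduce current operations. The Python sketch below flags every cost category in which the model output deviates from the accounting figure by more than a chosen tolerance. The cost figures, the 5% tolerance, and the function name are hypothetical.

```python
def validation_gaps(model_costs, accounting_costs, tolerance=0.05):
    """Return the categories whose relative gap (model - actual) / actual
    exceeds the tolerance in absolute value, mapped to that gap."""
    gaps = {}
    for category, actual in accounting_costs.items():
        rel = (model_costs[category] - actual) / actual
        if abs(rel) > tolerance:
            gaps[category] = rel
    return gaps

# Hypothetical annual costs in $ millions.
model = {"warehousing": 4.1, "inventory": 2.0,
         "production": 30.5, "transportation": 6.2}
books = {"warehousing": 4.0, "inventory": 2.1,
         "production": 30.0, "transportation": 9.0}
gaps = validation_gaps(model, books)
```

A large negative gap in a single category, as in this made-up example, points to a parameter that deserves a closer look, much like the effective truck capacity in the project described above.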

It is often also helpful to make local or small changes in the network configu- 
ration to see how the system estimates their impact on costs and service levels. 
Specifically, this step involves positing a variety of what-if questions. This includes 
estimating the impact of closing an existing warehouse on system performance. Or, 
to give another example, it allows the user to change the flow of material through 
the existing network and see the changes in the costs. Often, managers have good 
intuition about what the effect of these small-scale changes on the system should 
be, so they can more easily identify errors in the model. Intuition about the effect 
of radical redesign of the entire system is often much less reliable. To summarize, 
the model-validation process typically involves answering the following questions:

• Does the model make sense? 



• Are the data consistent?

• Can the model results be fully explained?

• Did you perform sensitivity analysis? 

Validation is critical for determining the validity of the model and data, but 
the process has other benefits. In particular, it helps the user make the connection 
between the current operations, which were modeled during the validation process, 
and possible improvements after optimization. 

Key Features of a Network-Configuration APS

One of the key requirements of any advance planning system for network design 
is flexibility. In this context, we define flexibility as the ability of the system to 
incorporate a large set of preexisting network characteristics. Indeed, depending 
on the particular application, a whole spectrum of design options may be appro- 
priate. At one end of this spectrum is the complete reoptimization of the existing 
network. This means that each warehouse can be either opened or closed and all 
transportation flows can be redirected. At the other end of the spectrum, it may 
be necessary to incorporate the following features in the optimization model: 

1. Customer-specific service-level requirements.

2. Existing warehouses. In most cases, warehouses already exist and their leases 
have not yet expired. Therefore, the model should not permit the closing of 
these warehouses. 

3. Expansion of existing warehouses. Existing warehouses may be expandable. 

4. Specific flow patterns. In a variety of situations, specific flow patterns (e.g., 
from a particular warehouse to a set of customers) should not be changed, 
or perhaps more likely, a certain manufacturing location does not or cannot 
produce certain SKUs. 

5. Warehouse-to-warehouse flow. In some cases, material may flow from one
warehouse to another warehouse. 

6. Production and bill of materials. In some cases, assembly is required and 
needs to be captured by the model. For this purpose, the user needs to 
provide information on the components used to assemble finished goods. In 
addition, production information down to the line level can be included in 
the model. 

It is not enough for the advance planning system to incorporate all of the features 
described above. It also must have the capability to deal with all these issues 
with little or no reduction in its effectiveness. The latter requirement is directly 
related to the so-called robustness of the system. This stipulates that the relative 
quality of the solution generated by the system (i.e., cost and service level) should 
be independent of the specific environment, the variability of the data, or the 
particular setting. If a particular advance planning system is not robust, it is 
difficult to determine how effective it will be for a particular problem. 

Important questions when designing the logistics network and when managing
inventory in a complex supply chain are where to keep safety stock and, similarly, 
which facilities should produce to stock and which should produce to order? The 
answers to these questions clearly depend on the desired service level, the supply 
network, lead times as well as a variety of operational issues and constraints. 
Thus, our focus is on a strategic model that allows the firm to position safety 
stock effectively in its supply chain. 

To illustrate the tradeoffs and the impact of strategically positioning safety stock 
in the supply chain, consider the following example. 

ElecComp Inc. is a large contract manufacturer of circuit boards and other high- 
tech parts. The company sells about 27,000 high-value products whose life cycle 
is relatively short. Competition in this industry forces ElecComp to commit to 
short lead times to its customers; this committed service time to the customers is 
typically much shorter than manufacturing lead time. Unfortunately, the manu- 
facturing process is quite complex, including a complex sequence of assemblies at 
different stages. 

Because of the long manufacturing lead time and the pressure to provide cus- 
tomers with a short response time, ElecComp kept inventory of finished products 
for many of its SKUs. Thus, the company managed its supply chain based on
long-term forecasts, the so-called push-based supply chain strategy. This
make-to-stock environment required the company to build safety stock and resulted
in huge financial and shortage risks.

Executives at ElecComp had long recognized that this push-based supply chain 
strategy was not the appropriate strategy for their supply chain. Unfortunately, 
because of the long lead time, a pull-based supply chain strategy, in which manu- 
facturing and assembly are done based on realized demand, was not appropriate 
either. 

Thus, ElecComp focused on developing a new supply chain strategy whose
objectives are

1. reducing inventory and financial risks, 

2. providing customers with competitive response times. 

This could be achieved by 

• determining the optimal location of inventory across the various stages of 
the manufacturing and assembly process, 

• calculating the optimal quantity of safety stock for each component at each 
stage. 

Thus, the focus of redesigning ElecComp’s supply chain was on a hybrid strategy
in which a portion of the supply chain is managed based on push, that is, a
make-to-stock environment, while the remaining portion of the supply chain is managed
based on pull, or a make-to-order strategy. Evidently, the supply chain stages
that produce to stock will be the locations where the company keeps safety stock, 
while the make-to-order stages will keep no stock at all. Hence, the challenge 
was to identify the location in the supply chain in which the strategy is switched 
from a push-based, namely, a make-to-stock strategy, to a pull-based, that is, a 
make-to-order supply chain strategy. This location is referred to as the push-pull 
boundary. 

ElecComp developed and implemented the new push-pull supply chain strategy, 
and the impact was dramatic! For the same customer lead times, safety stock was 
reduced by 40-60%, depending on product line. More importantly, with the new
supply chain structure, ElecComp concluded that it could cut lead times to
its customers by 50% and still enjoy a 30% reduction in safety stock.

Below we describe how this was achieved for a number of product lines. 

An Illustrative Example 

To understand the analysis and the benefit experienced by ElecComp, consider 
Fig. 20.6, in which a finished product (Part I) is assembled in a Dallas facility from 
two components, one produced in the Montgomery facility and one in a different 
facility in Dallas. Each box provides information about the value of the product 
produced by that facility; numbers under each box are the processing time at that 
stage; bins represent safety stock. Transit times between facilities are provided as 
well. Finally, each facility provides a committed response time to the downstream 
facilities. For instance, the assembly facility quotes 30 days’ response time to its 
customers. This implies that any order can be satisfied in no more than 30 days. 
The Montgomery facility quotes an 88-day response time to the assembly facility. 
As a result, the assembly facility needs to keep inventory of finished products in 
order to satisfy customer orders within its 30 days’ committed service time. 
FIGURE 20.7. Current safety stock locations 

Observe that if somehow ElecComp can reduce the committed service time from 
the Montgomery facility to the assembly facility from 88 days to say 50 or perhaps 
40 days, the assembly facility will be able to reduce its finished goods inventory 
while the Montgomery facility will need to start building inventory. Of course, 
ElecComp’s objective is to minimize system-wide inventory and manufacturing 
costs; this is precisely what Inventory Analyst™ from LogicTools allows users to 
do. By looking at the entire supply chain, the tool determines the appropriate 
inventory level at each stage. 

For instance, if the Montgomery facility reduces its committed lead time to 
13 days, then the assembly facility does not need any inventory of finished goods. 
Any customer order will trigger an order for Parts II and III. Part II will be 
available immediately, since the facility producing Part II holds inventory, while 
Part III will be available at the assembly facility in 15 days: 13 days’ committed 
response time by the manufacturing facility plus 2 days’ transportation lead time. 
It takes another 15 days to process the order at the assembly facility; therefore, the 
order will be delivered to the customers within the committed service time. Thus, 
in this case, the assembly facility produces to order, that is, a pull-based strategy, 
while the Montgomery facility needs to keep inventory and hence is managed based 
on push, that is, a make-to-stock strategy. 
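The arithmetic in this paragraph can be checked with a short sketch. The function name and the (committed time, transit time) representation are ours, introduced only for illustration, and we assume that a stage holding no inventory must wait for its slowest input before it starts processing. We also assume zero transit time for Part II, since the text says only that it is available immediately.

```python
def quoted_response_time(processing_days, inbound):
    """Response time of a stage that holds no inventory.

    inbound is a list of (committed_days, transit_days) pairs, one per
    upstream supplier. The stage waits for its slowest input, then processes.
    """
    wait = max((committed + transit for committed, transit in inbound), default=0)
    return wait + processing_days


# Part II is held in stock, so it is available immediately (transit assumed 0).
part_ii = (0, 0)
# Montgomery commits 13 days, plus 2 days of transportation.
part_iii = (13, 2)

days = quoted_response_time(15, [part_ii, part_iii])
print(days)  # 30, which meets the 30-day commitment without finished goods stock

# With the original 88-day commitment the same calculation gives 105 days,
# so the assembly facility must hold finished goods to quote 30 days.
print(quoted_response_time(15, [part_ii, (88, 2)]))  # 105
```

The comparison of the computed value with the 30-day commitment is what decides whether the assembly stage can operate in a make-to-order fashion.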

Now that the tradeoffs are clear, consider the product structure depicted in 
Fig. 20.7. Light boxes (parts 4, 5, and 7) represent outside suppliers, whereas dark 
boxes represent internal stages within ElecComp’s supply chain. Observe that the 
assembly facility commits a 30-day response time to the customers and keeps 
inventory of finished goods. More precisely, the assembly facility and the facility 
manufacturing Part II both produce to stock. All other stages produce to order. 

Figure 20.8 depicts the optimized supply chain that provides customers with the 
same 30-day response time. Observe that by adjusting the committed service time 
of various internal facilities, the assembly system starts producing to order and 
keeps no finished goods inventory. On the other hand, the Raleigh and Montgomery 
facilities need to reduce their committed service time and hence keep inventory. 
FIGURE 20.8. Optimized safety stock locations 

FIGURE 20.9. Optimized safety stock with reduced lead time 



So where is the push and where is the pull in the optimized strategy? Evidently,
the assembly facility and the Dallas facility that produces Part II both now
operate in a make-to-order fashion, a pull strategy, while the Montgomery facility
operates in a make-to-stock fashion, a push-based strategy. The impact on the
supply chain is a 39% reduction in safety stock!

At this point it was appropriate to analyze the impact of a more aggressive
quoted lead time to the customers. That is, ElecComp executives considered
reducing quoted lead times to the customers from 30 to 15 days. Figure 20.9 depicts
the optimized supply chain strategy in this case. The impact was clear. Relative
to the baseline (Fig. 20.7), inventory was down by 28%, while response time to
the customers was halved.

Finally, Figs. 20.10 and 20.11 present a more complex product structure.
Figure 20.10 provides information about the supply chain strategy before
optimization, and Fig. 20.11 depicts the supply chain strategy after optimizing the
push-pull boundary as well as inventory levels at different stages in the supply chain.
Again, the benefit is clear. By correctly selecting which stages produce
to order and which produce to stock, ElecComp reduced the inventory cost by
more than 60% while maintaining the same quoted lead time to the customers.
Summary 

Using a multistage inventory optimization technology (Inventory Analyst™ 
from LogicTools), ElecComp was able to significantly reduce inventory cost while 
maintaining and sometimes significantly decreasing quoted service times to the 
customers. This was achieved by 

1. Identifying the push-pull boundary; that is, identifying supply chain stages 
that should operate in a make-to-stock fashion and hence keep safety stock. 
The remaining supply chain stages operate in a make-to-order fashion and 
thus keep no inventory. This is done by pushing inventory to less costly 
locations in the supply chain. 

2. Taking advantage of the risk-pooling concept. This concept suggests that 
demand for a component used by a number of finished products has smaller 
variability and uncertainty than that of the finished goods. 

3. Replacing traditional supply chain strategies that are typically referred to
as sequential, or local, optimization with a globally optimized supply chain
strategy. In a sequential, or local, optimization strategy, each stage tries to
optimize its profit with very little regard to the impact of its decisions on
other stages in the same supply chain. On the other hand, in a global supply
chain strategy, one considers the entire supply chain and identifies strategies
for each stage that will maximize the supply chain performance.
FIGURE 20.11. Optimized supply chain 



To better understand the impact of the new supply chain paradigm employed 
by ElecComp, consider Fig. 20.12, where we plot the total inventory cost against 
the quoted lead time to the customers. The upper tradeoff curve represents the 
traditional relationship between cost and quoted lead time to the customers. This 
curve is a result of locally optimizing decisions at each stage in the supply chain. 
The lower tradeoff curve is the one obtained when the firm globally optimizes the 
supply chain by locating the push-pull boundary correctly. 

Observe that this shift of the tradeoff curve, due to optimally locating the push- 
pull boundary, implies the following: 

1. For the same quoted lead time, the company can significantly reduce cost, 
or 

2. For the same cost, the firm can significantly reduce lead time. 

Finally, notice that the curve representing the traditional relationship between 
cost and customer quoted lead time is smooth, while the new tradeoff curve rep- 
resenting the impact of optimally locating the push-pull boundary is not, with 
jumps in various places. These jumps represent situations in which the location of 
the push-pull boundary changes and significant cost savings are achieved. 

Our experience is that those employing the new supply chain paradigm, like
ElecComp, typically choose a supply chain strategy that reduces both cost and customer
quoted lead time. This strategy allows ElecComp to satisfy demand much faster
than its competitors and to develop a cost structure that enables competitive
pricing.

FIGURE 20.12. Global vs. local optimization



20.4 Resource Allocation 

Supply chain master planning is defined as the process of coordinating and
allocating production and distribution strategies and resources to maximize profit
or minimize systemwide cost. In this process, the firm considers forecast demand 
for the entire planning horizon, such as the next 52 weeks, as well as safety stock 
requirements. The latter are determined, for instance, based on models similar to 
the one analyzed in the previous section. 

The challenge of allocating production, transportation, and inventory resources 
in order to satisfy demand can be daunting. This is especially true when the firm 
is faced with seasonal demand, limited capacities, competitive promotions, or high 
volatility in forecasting. Indeed, decisions such as when and how much to produce, 
where to store inventory, and whether to lease additional warehouse space may 
have enormous impact on supply chain performance. 

Traditionally, the supply chain planning process was performed manually with 
a spreadsheet and was done by each function in the company independently of 
other functions. That is, the production plan would be determined at the plant, 
independently from the inventory plan, and would typically require the two plans 
to be somehow coordinated at a later time. This implies that divisions typically 
end up “optimizing” just one parameter, usually production costs. 

In modern supply chains, however, this sequential process is replaced by a
process that takes into account the interaction between the various levels of the supply
chain and identifies a strategy that maximizes supply chain performance. This is
referred to as global optimization, and it requires an optimization-based
advanced planning system (APS). These systems, which model the supply chain as
large-scale mixed integer linear programs, are analytical tools capable of
capturing the complexity and dynamic nature of the supply chain.

FIGURE 20.13. The extended supply chain: from manufacturing to order fulfillment

Typically, the output from the tool is an effective supply chain strategy that 
coordinates production, warehousing, transportation, and inventory decisions. The 
resulting plan provides information on production quantities, shipment sizes, and 
storage requirements by product, location, and time period. This is referred to as 
the supply chain master plan. 

In some applications, the supply chain master plan serves as an input for a 
detailed production scheduling system. In this case, the production scheduling 
system employs information about production quantities and due dates received 
from the supply chain master plan. This information is used to propose a detailed 
manufacturing sequence and schedule. This allows the planner to integrate the 
back end of the supply chain — manufacturing and production — and the front end 
of the supply chain — demand planning and order replenishment; see Fig. 20.13. 
This diagram illustrates an important issue. The focus of order replenishment 
systems, which are part of the pull portion of the supply chain, is on the service 
level. Similarly, the focus of the tactical planning, that is, the process by which 
the firm generates a supply chain master plan, which is in the push portion of the 
supply chain, is on cost minimization or profit maximization. Finally, the focus in 
the detailed manufacturing scheduling portion of the supply chain is on feasibility. 
That is, the focus is on generating a detailed production schedule that satisfies all 
production constraints and meets all the due-date requirements generated by the 
supply chain master plan. 

Of course, the output from the tactical planning process, namely, the supply 
chain master plan, is shared with supply chain participants to improve coordination 
and collaboration. For example, the distribution center managers can now better 
use this information to plan their labor and shipping needs. Distributors can share 
plans with their suppliers and customers in order to decrease costs for all partners 
in the supply chain and promote savings. Specifically, distributors can realign 
territories to better serve customers, store adequate amounts of inventory at the 
customer site, and coordinate overtime production with suppliers. 
In addition, supply chain master planning tools can identify potential supply 
chain bottlenecks early in the planning process, allowing the planner to answer 
questions such as 

• Will leased warehouse space alleviate capacity problems? 

• When and where should the inventory for seasonal or promotional demand 
be built and stored? 

• Can capacity problems be alleviated by rearranging warehouse territories? 

• What impact do changes in the forecast have on the supply chain? 

• What will be the impact of running overtime at the plants or outsourcing 
production? 

• What plant should replenish each warehouse? 

• Should the firm ship by sea or by air? Shipping by sea implies long lead 
times and therefore requires high inventory levels. On the other hand, using 
air carriers reduces lead times and hence inventory levels but significantly 
increases transportation cost. 

• Should we rebalance inventory between warehouses or replenish from the 
plants to meet unexpected regional changes in demand? 

Another important capability that tactical planning tools have is the ability to 
analyze demand plans and resource utilization to maximize profit. This enables 
balancing the effect of promotions, new product introductions, and other planned 
changes in demand patterns and supply chain costs. Planners now are able to 
analyze the impact of various pricing strategies as well as identify markets, stores, 
or customers that do not provide the desired profit margins. 

A natural question is when one should focus on cost minimization and when on
profit maximization. While the answer to this question may vary from instance to
instance, it is clear that cost minimization is important when the structure of the
supply chain is fixed or at times of recession and therefore oversupply. In this
case, the focus is on satisfying all demand at the lowest cost by allocating resources
effectively. On the other hand, profit maximization is important at times of growth,
namely, when demand exceeds supply. In this case, capacity can be
limited because of the use of limited natural resources or because of expensive
manufacturing processes that are hard to expand, as is the case in the chemical
and electronics industries. In these cases, deciding whom to serve and at what price
is more critical than cost savings.

Finally, an effective supply chain master planning tool must also be able to help
the planners improve the accuracy of the supply chain model. This may seem
counterintuitive, since the accuracy of the supply chain master planning model
depends on the accuracy of the demand forecast that is an input to the model.
However, notice that the accuracy of the demand forecast is typically time-dependent.
That is, the accuracy of forecast demand for the first few time periods, for
instance, the first 10 weeks, is much higher than the accuracy of demand forecast for
later time periods. This suggests that the planner should model the early portion
of the demand forecast at a great level of detail, that is, apply weekly demand 
information. On the other hand, demand forecasts for later time periods are not 
as accurate, and hence the planner should model the later demand forecast month 
by month or by groups of 2-3 weeks each. This implies that later demand fore- 
casts are aggregated into longer time buckets, and hence, due to the risk-pooling 
concept, the accuracy of the forecast improves. 
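The effect of aggregating later periods into longer time buckets can be illustrated with a minimal sketch. It rests on an assumption that the text does not state: weekly forecast errors are independent, with the same mean demand and standard deviation in every week. The function name and the numbers are hypothetical.

```python
import math


def bucket_cv(weekly_mean, weekly_std, weeks_per_bucket):
    """Coefficient of variation of total demand over one time bucket.

    Assumes independent weekly errors, so variances add across weeks while
    means add linearly.
    """
    mean = weekly_mean * weeks_per_bucket
    std = weekly_std * math.sqrt(weeks_per_bucket)
    return std / mean


# Hypothetical product: mean weekly demand 100, standard deviation 40.
for weeks in (1, 2, 4):
    print(weeks, round(bucket_cv(100.0, 40.0, weeks), 3))
```

Under these assumptions the relative error of a bucket shrinks in proportion to one over the square root of the number of weeks it covers, which is the risk-pooling effect mentioned above. Correlated errors would weaken the effect.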

In summary, supply chain master planning helps address fundamental tradeoffs 
in the supply chain such as setup cost versus holding costs or production lot sizes 
versus capacities. It takes into account supply chain costs such as production, 
supply, warehousing, transportation, taxes, and inventory, as well as capacities 
and changes in the parameters over time. 

The following example illustrates how supply chain master planning can be used
dynamically and consistently to help a large food manufacturer manage the supply
chain. The food manufacturer makes production and distribution decisions at the 
division level. Even at the division level, the problems tend to be large-scale. 
Indeed, a typical division may include hundreds of products, multiple plants, 
many production lines within a plant, multiple warehouses (including overflow 
facilities), bill-of-material structures to account for different packaging options, 
and a 52-week demand forecast for each product for each region. The forecast ac- 
counts for seasonality and planned promotions. The annual forecast is important 
because a promotion late in the year may require production resources relatively 
early in the year. Production and warehousing capacities are tight, and products 
have a limited shelf life that needs to be integrated into the analysis. Finally, the 
scope of the plan spans many functional areas, including purchasing, production, 
transportation, distribution, and inventory management. Traditionally, the supply 
chain planning process was performed manually with a spreadsheet and was done 
by each function in the company. That is, the production plan would be done at 
the plant, independently from the inventory plan, and would typically require the 
two plans to be somehow coordinated at a later time. This implies that divisions 
typically end up “optimizing” just one parameter, usually production costs. The 
tactical planning APS introduced in the company allows the planners to reduce 
systemwide cost and better utilize resources such as manufacturing and warehous- 
ing. Indeed, a detailed comparison of the plan generated by the tactical tool with 
the spreadsheet strategy suggests that the optimization-based tool is capable of 
reducing total costs across the entire supply chain. See Fig. 20.14 for illustrative 
results. 



20.5 Summary 

Optimizing supply chain performance is difficult because of conflicting objectives,
demand and supply uncertainties, and supply chain dynamics. However, through
network planning, which combines network design, inventory positioning, and
resource allocation, the firm can globally optimize supply chain performance. This
is achieved by considering the entire network, taking into account production,
warehousing, transportation, and inventory costs, as well as service-level
requirements.

FIGURE 20.14. Comparison of manual vs. optimized scenarios

TABLE 20.1. Network planning characteristics

Table 20.1 summarizes the key dimensions of each of the planning activities:
network design, inventory positioning/management, and supply chain master
planning. The table shows that network design involves long-term plans, typically over
years, is done at a high level, and can yield high returns. The planning horizon for
supply chain master planning is months or weeks, the frequency of replanning is
high (e.g., every week), and it typically delivers quick results as well. Inventory
planning is focused on short-term uncertainty in demand, lead time, processing
time, or supply. The frequency of replanning is high, such as monthly planning
to determine appropriate safety stock based on the latest forecast and forecast
error. Inventory planning can also be used more strategically to identify locations
in the supply chain where the firm keeps inventory, as well as to identify stages
that produce to stock and those that produce to order.



20.6 Exercises 



Exercise 20.1. Consider $n$ independent and identically distributed random
variables, $X_1, X_2, \ldots, X_n$. Let $S_n = \sum_{i=1}^{n} X_i$. Find the variance of the random
variable $S_n$ as a function of the variance of $X_i$.
21.1 Introduction 

We now turn our attention to a case study in transportation logistics. We highlight 
particular issues that arise when implementing an optimization algorithm in a real- 
life routing situation. The case concerns the routing and scheduling of school buses 
in the five boroughs of New York City. 

Many of the vehicle routing problems we have discussed so far (see Part II) 
have been simplified versions of the usually more complex problems that appear 
in practice. Typically, a vehicle routing problem will have many constraints on 
the types of routes that can be constructed, including multiple vehicle types, time 
and distance constraints, complex restrictions on what items can be in a vehicle 
together, and so forth. The problems that appear in the context of school bus 
routing and scheduling could be characterized as the most difficult types of vehi- 
cle routing problems since they have aspects of all these constraints. This is the 
problem we will consider here. 

School bus routing and scheduling is an area where, in general, computerized
algorithms can have a large impact. User-friendly software that calls routing and
scheduling algorithms at the click of a button and produces workable solutions
can greatly affect the day-to-day operations of a dispatching unit. With
increasingly affordable high-speed computing power in desktop computers and
the possibility of displaying geographic information on-screen, it is not surprising
that many communities are using expert systems to perform the daunting task of
routing and scheduling their school buses. In most cases, this has led to improved
solutions in a fraction of the time previously required.
Unfortunately, providing workable solutions for an application such as this is 
not as simple as just “clicking” the right button. Anyone who has been involved 
in a real-life optimization application knows that much discussion is involved in 
determining what the problem is and how we are to “solve” it. In this chapter, we 
concern ourselves with some of the details that make it possible to put modeling 
assumptions and algorithms into action. 



21.2 The Setting 

The New York City school system is composed of 1,069 schools and approximately 
one million students. Most of these students either walk to school or are given pub- 
lic transportation passes. About 125,000 students ride school buses that are leased 
by the Board of Education. The majority, or about 83,000, of these are classified 
as General Education students. These students walk to their neighborhood bus 
stop in the morning and wait for a bus to take them to school. In the afternoon, a 
bus takes them from their school and drops them off at their bus stop. The
remaining students, who have particular needs and are classified as Special
Education, are picked up and dropped off directly at their homes.

This is one distinction that makes the transportation policies governing Special 
Ed students fundamentally different from those of General Ed students. Another 
fundamental difference is that, in many cases, Special Ed students enroll in schools 
with specific services and therefore may be bused over long distances. General Ed 
students usually go to schools only a few miles from their homes and almost 
exclusively to schools within the same borough. In addition, Special Ed students, 
such as wheelchair-bound students, are transported in specially designed vehicles 
with much smaller carrying capacities. 

For General Ed student transportation, currently the Board of Education leases 
approximately 1,150 buses a year. Many companies bid for the contract to trans- 
port the students; currently, the companies winning contracts are responsible for 
designing the routes. Independent of the company, the leasing cost to the Board is 
approximately $80,000 annually for each bus (and driver). The total yearly bud- 
get for General Education student transportation alone is therefore close to $100 
million. 

The routing of Special Education students is done differently. Using colored 
pins and large maps placed on walls, a team of inspectors/routers at the Board of 
Education Office of Pupil Transportation mark the students’ homes and schools. 
Then, using their knowledge of the geography and street conditions acquired 
through their many years of work, they literally string pins together to form 
routes. Although the inspectors clearly do this well, this is very time-consuming. 
For example, a group of five people took approximately three months to manually 
generate routes just for the Borough of Manhattan. 

Several years ago, the New York City Board of Education appropriated funds to 
develop a computerized system, called CATS (computer-assisted transportation 
system). This system is supposed to help in the design of routes for both the 
General and Special Ed students. The project consists of three phases, discussed 
next. 

Phase I: Replicate the pinning-and-stringing approach on a computer. The purpose 
of this phase is to emulate on the computer screen what was previously done with 
maps, pins, and string. First, a database is needed to keep track of all relevant 
student and school information. The student data consist of address, bus stop, 
and school. For each school, the data consist of an address, as well as starting 
and ending times for all sessions. This makes data easily retrievable and updat- 
able, and provides some of the basic information that is needed for routing and 
scheduling. In addition to the database, a method of generating maps on the com- 
puter is needed as well; this is the geographic information system (GIS). These 
systems, widely available only in the last few years, truly offer a new dimension 
to many decision-support systems. With this software, color-coded objects des- 
ignating students or schools can easily be displayed on a computer screen. This 
enables the user to visualize the relative locations of important points. In addi- 
tion, the user can “click and drag” with a mouse and get information about the 
area outlined. This information can include U.S. Census data such as number of 
households, median age, income, etc. More importantly, in this application, by 
designating two points, the GIS can calculate exact locations (latitude and longi- 
tude coordinates) and also the distance between the two points along the street 
network. By “stringing” together a series of points, the software can give the total 
distance traveled. When this phase is completed, inspectors currently designing 
Special Ed routes will be able to “click” on bus stops with a mouse and “string” 
them together on the computer screen. This is the method called “blocking and 
stringing.” 
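The distance calculation described in Phase I can be sketched as follows. This is only an illustration of the idea, not the CATS implementation: we assume the street network is available as a small weighted graph, we use Dijkstra's algorithm for the distance between two points, and the toy network, node names, and function names are hypothetical.

```python
import heapq


def shortest_distance(graph, source, target):
    """Street-network distance between two points (Dijkstra's algorithm)."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, length in graph.get(node, {}).items():
            candidate = dist + length
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return float("inf")  # target not reachable


def strung_route_length(graph, stops):
    """Total distance of a route formed by stringing stops together in order."""
    return sum(shortest_distance(graph, a, b) for a, b in zip(stops, stops[1:]))


# Hypothetical street network: intersections and segment lengths in miles.
streets = {
    "A": {"B": 0.2, "C": 0.5},
    "B": {"A": 0.2, "C": 0.2, "D": 0.6},
    "C": {"A": 0.5, "B": 0.2, "D": 0.3},
    "D": {"B": 0.6, "C": 0.3},
}
print(strung_route_length(streets, ["A", "C", "D"]))
```

In this toy network the distance from A to C is measured through B (0.4 miles) rather than along the direct 0.5-mile segment, which is the difference between a street-network distance and a single-segment measurement.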

Phase II: Extend the functionality developed in Phase I to the General Education
stop-to-school service. The goal is to create a system whereby one could construct
routes for the General Ed population on the computer screen. For example, by 
choosing a set of schools with a mouse, the pertinent bus stops (those with stu- 
dents going to the set of schools) are highlighted. The inspector can then string 
together the stops and schools to form a route directly on the computer screen, or 
again let the computer determine a good route through the stops. The immedi- 
ate visualization of a possible solution (routes) along with relevant statistics (bus 
load, total travel time, total students picked up) makes it much easier to check 
the feasibility of the routes. This alone considerably simplifies the task of building 
efficient routes. 

Phase III: Create an optimization module. The aim here is to build software that 
uses the student and school data and the GIS to generate efficient bus routes and 
schedules meeting existing transportation policies. The software should include 
subroutines that check the feasibility of suggested routes or design routes for any 
subset of the population, be it a school, a district, a borough, or the entire city. 
This is the phase in which we are the most interested. 
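A feasibility subroutine of the kind mentioned in Phase III might look like the sketch below. It checks only two of the statistics listed in Phase II, bus load and total travel time. The limits used here (a seat capacity and a maximum route duration) are placeholders that we assume for illustration and are not the Board's actual transportation policies, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stop:
    students: int                 # students boarding at this stop
    minutes_from_previous: float  # travel time from the previous stop


def route_statistics(stops):
    """Summary statistics for a suggested route."""
    return {
        "bus_load": sum(stop.students for stop in stops),
        "total_travel_minutes": sum(stop.minutes_from_previous for stop in stops),
    }


def is_feasible(stops, bus_capacity, max_travel_minutes):
    """True if the route respects the assumed capacity and time limits."""
    stats = route_statistics(stops)
    return (stats["bus_load"] <= bus_capacity
            and stats["total_travel_minutes"] <= max_travel_minutes)


route = [Stop(12, 0.0), Stop(15, 6.5), Stop(9, 4.0)]
print(route_statistics(route))
print(is_feasible(route, bus_capacity=40, max_travel_minutes=45.0))
```

Keeping the statistics separate from the test makes it straightforward to display them to an inspector while still using the same numbers inside an optimization module.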
We present here a range of issues related to the development of this optimization 
module (Phase III) and to the problem of routing buses through the New York 
City streets. We focus on routing the General Ed students; the routing of Special 
Ed students is currently being done at the Office of Pupil Transportation using 
the “blocking-and-stringing” approach. 

In Sect. 21.3, we give a short summary of some of the important papers that 
have appeared in the literature in the area of school bus routing and scheduling 
and related vehicle routing problems. In Sect. 21.4, we present details of the school 
bus routing and scheduling problem in Manhattan. In Sect. 21.5, we give a brief 
overview of the methodologies we used to estimate distances, travel times, and the 
pickup and dropoff times. 

When a computerized system for this problem is being designed, it is important 
to consider the following questions. First, is it possible to design an algorithm that 
will generate quality solutions in a reasonable amount of computing time? Second, 
are routes constructed by the computerized system truly driveable? Third, what is 
the best way to make these computerized algorithms of use to the people designing 
the routes? To answer the first two questions, we designed a school bus routing 
and scheduling algorithm and ran it on the Manhattan data. The algorithm is pre- 
sented in Sect. 21.6. To answer the third question, in Sect. 21.8, we discuss some 
of the ways in which a computerized system for school bus routing can be made 
more interactive. In Sect. 21.9, we present results on the Manhattan data. 



21.3 Literature Review 

In the operations research literature, we find quite a few references to the problem 
as well as many different solution techniques. A standard way the school bus 
routing and scheduling problem has been analyzed is to decompose it into two 
problems: a route-generation problem, where simple routes are designed (usually 
with only one school); and a route scheduling problem, where these routes are 
linked to form longer routes (routes that visit more than one school). 

As early as 1969, Newton and Thomas looked at a bus routing problem for a sin- 
gle school. Using some of the first local improvement procedures for vehicle routing 
problems, they designed a tour through all the bus stops and then partitioned it 
into smaller feasible routes that each could be covered by a bus. 
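The second step of this tour-then-partition idea can be sketched with a simple greedy split. This is not Newton and Thomas's procedure, which we do not reproduce here. It assumes that bus capacity is the only constraint and that the tour order is already given, and the stop names and student counts are hypothetical.

```python
def partition_tour(tour, students_at, bus_capacity):
    """Split a tour through all stops into consecutive routes, one per bus.

    Greedy rule: keep adding stops in tour order until the next stop would
    exceed the bus capacity, then start a new route.
    """
    routes, current, load = [], [], 0
    for stop in tour:
        demand = students_at[stop]
        if demand > bus_capacity:
            raise ValueError(f"stop {stop!r} alone exceeds the bus capacity")
        if load + demand > bus_capacity:
            routes.append(current)
            current, load = [], 0
        current.append(stop)
        load += demand
    if current:
        routes.append(current)
    return routes


tour = ["a", "b", "c", "d"]
students_at = {"a": 20, "b": 25, "c": 30, "d": 10}
print(partition_tour(tour, students_at, bus_capacity=50))  # [['a', 'b'], ['c', 'd']]
```

A greedy split is not guaranteed to use the fewest buses for a given tour order. It is used here only because it is the shortest way to show the partitioning step.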

In 1972, Angel et al. considered a clustering approach to generating routes. First, 
bus stops are grouped by their proximity using a clustering algorithm. Then an 
attempt is made to find minimum-length routes through these clusters in such a 
way that the constraints are satisfied. Finally, some clusters are merged if this 
is feasible. The algorithm was applied to an instance consisting of approximately 
1,500 students and 5 schools in Indiana. 

In 1972, Bennett and Gazis considered the problem of generating routes. They 
modified the savings algorithm of Clarke and Wright (1964) (see Sect. 17.2). They 
also experimented with different objective functions, such as minimizing total 
student-miles. The problem considered had 256 bus stops and approximately 30 
routes in Toms River, New Jersey. 

In 1979, Bodin and Berman used a 3-opt procedure to generate an initial
traveling salesman tour that is then partitioned into feasible routes. This algo- 
rithm uses two additional components: a lookahead feature and a bus stop splitter. 
The lookahead feature allows the initial order to be changed slightly. The bus stop 
splitter allows a bus stop to be split into smaller bus stops. Two problems were 
studied. One dealt with a school district in a densely populated suburban area with 
13,000 students requiring bus transportation each day and 25 schools. A second 
district, also in a suburban area, had 4,200 students transported. 

In 1984, Swersey and Ballard addressed only the problem of scheduling a set 
of routes that had already been designed. Given a set of routes that delivered all 
students from their bus stops to their schools, the authors devised a method to find 
the minimum number of buses that could “cover” these routes. This scheduling 
problem can be formulated as a difficult integer program. The authors used some 
simple cutting planes to solve it heuristically. The size of the problem considered 
was approximately 30-38 buses and 100 routes. 

Finally, in 1986, Desrosiers et al. studied a bus routing problem in Montreal, 
Canada. Using several techniques, depending on whether the stops were in rural or 
urban areas, they generated a set of routes. To schedule them, they formulated the 
problem as an integer program and solved it using a column-generation approach. 
The problem solved had 60 schools and 20,000 students. 



21.4 The Problem in New York City 

Depending on how generally it is formulated, the school bus routing and scheduling 
problem can take many forms. In its most general form, the problem consists of 
a set of students distributed in a region who have to be brought to and from 
their schools every school day. The problem consists of determining bus stop loc- 
ations, assigning students to bus stops, and finally routing and scheduling the 
buses so as to minimize the total operating cost while following all transportation 
guidelines. The difficulty, of course, is that these subproblems are interdependent
and therefore should be considered simultaneously. That is, any determination of
bus stop locations, and who gets assigned to each, clearly has an impact on the 
routes and schedules of the buses. Hence, an integrated approach is required to 
avoid suboptimality. However, due to the complexity and the size of the problem, 
this has historically never been attempted. In addition, it is often not
possible to reoptimize all aspects of the problem, such as bus stop locations or
assignments.

To understand why this problem is so complex, consider, for instance, the bus 
stop location problem on its own. There are numerous constraints and require- 
ments: No more than a certain number of students can be assigned to the same 
bus stop; bus stops cannot be within a certain distance of each other; each student 
must be within a short walk of the bus stop and must not cross a major thorough- 
fare; and so on. 

In our case, the Board of Education decided that the bus stops that are currently
being used will remain in use. Thus, the position of the bus stops as well as 
which students are assigned to each were assumed fixed. These stops satisfy all 
the requirements mentioned above. Our routing and scheduling problem thus starts 
with a set of bus stops, each with a particular number of students assigned to it 
destined for a particular school. Each school has starting and ending times for each 
session. In addition to bus stop and school data, it is assumed that the distance 
and travel time between any two points in the area are readily available. This issue 
will be discussed in more detail in Sect. 21.5. 

We formally define a route as follows. A route is a sequence of stops and possibly 
several schools that can be feasibly driven by one bus. For example, routes for the 
morning problem always start with a pickup at a stop and end with dropoff at a 
school. In contrast, an afternoon route always starts with a pickup at a school and 
ends with a dropoff at a stop. 

The goal is to design a set of minimum-cost routes satisfying all existing trans- 
portation guidelines. The major cost component to the Board of Education is the 
cost of leasing each bus and driver, and hence the objective is essentially to mini- 
mize the number of buses needed to feasibly transport the students. Clearly, safety 
is the first consideration, and it is the view of the Board of Education that bus 
routes that meet all transportation guidelines provide a high level of safety. The 
rest is up to the drivers. 

Route feasibility is the most complex aspect of the problem. There are numerous 
side constraints. First, the bus can hold only a limited number of students at one 
time (capacity constraint). Second, each student must not be on the bus for more
than a specific amount of time and/or distance (time or distance constraint). This
is motivated by the simple observation that the less time spent on the bus, the 
safer and more desirable it is for the students. And finally, there are restrictions 
on the time a bus can arrive at a school in the morning, and on the time a bus 
can leave the school in the afternoon (time-window constraints). In many school
bus routing and scheduling problems, transportation policies specify that students 
from different schools not be put on the same bus at the same time; that is, no 
mixed loads are allowed. Clearly, allowing mixed loads provides increased flexibility 
and therefore can lead to savings in cost. In New York City, for the most part, 
mixed loads are allowed. The constraints listed here are the primary ones; several
others are discussed in Sect. 21.7.

We will deal exclusively here with the problem of delivering the students to 
their school in the morning. Researchers have noted that this problem is usually 
more critical than the afternoon problem for two reasons. First, in the afternoon, 
the time windows are usually less constraining. For example, in Manhattan (in 
the morning), school starting times fall between 7:30 and 9:00 a.m. That gives 
roughly a one-and-a-half-hour time window to pick up all students and take them 
to their schools. In the afternoon, schools end at times over a wider range: anywhere 
between 1:00 and 4:15 p.m. Second, traffic congestion is usually higher in the 
morning hours than in the afternoon hours when the students are being bused. 
Therefore, it is very likely more buses will be needed in the morning than in the 
afternoon. Indeed, our computational experiments reported in Sect. 21.9 verify that
this is true in Manhattan. Note that the solution found in the morning cannot be 
simply replicated in the afternoon, that is, having each bus travel the same route 
as in the morning but in the opposite direction. This is true since the sequencing 
of school ending times in the afternoon is different from the sequencing of school 
starting times; therefore, schools visited in one order in the morning cannot always 
be visited in the same or opposite order in the afternoon. 

For the morning problem in Manhattan, the specific problem parameters are 
given below. During the 1992-1993 academic year, 4,619 students were transported 
by school buses from 838 bus stops to 73 schools. The constraints were as follows. 

• Vehicle capacity constraint. At most 66 students can be on the bus at one 
time. 

• Distance constraint. Each student cannot be on the bus for more than 5 miles. 

• Time-window constraints. Buses must arrive at a school no earlier than 
25 min before and no later than 5 min before the start of school. 

• The earliest pickup must not be before 7:00 a.m. 

• Mixed loads are allowed. 

The 5-mile distance constraint is not applied uniformly to all students; students 
in District 6 (upper Manhattan) are often transported out of their district due to 
overcrowding. Therefore, since this involves longer trips, sometimes traversing most 
of the island, the 5-mile constraint is not applied to these students. Approximately 
36% of the students in our application were in this group. 
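
The parameters above can be gathered into a small configuration object, and the time-window rule then reduces to a simple comparison. The following Python sketch is illustrative only: the names ManhattanRules and arrival_ok are hypothetical and are not part of the system described here; only the numeric values are taken from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass(frozen=True)
class ManhattanRules:
    """Hypothetical container for the morning-problem parameters."""
    capacity: int = 66                  # students on the bus at one time
    max_ride_miles: float = 5.0         # waived for some District 6 students
    earliest_arrival_min: int = 25      # minutes before the start of school
    latest_arrival_min: int = 5         # minutes before the start of school
    earliest_pickup: time = time(7, 0)  # no pickup before 7:00 a.m.
    mixed_loads: bool = True


def arrival_ok(arrival: datetime, school_start: datetime,
               rules: ManhattanRules) -> bool:
    """True if the bus reaches the school inside its time window."""
    window_open = school_start - timedelta(minutes=rules.earliest_arrival_min)
    window_close = school_start - timedelta(minutes=rules.latest_arrival_min)
    return window_open <= arrival <= window_close
```

For a school starting at 8:30 a.m., this check accepts arrivals between 8:05 and 8:25 a.m. inclusive.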

The Manhattan school bus routing problem presents many challenges. First of 
all, the number of bus stops and schools is much larger than those encountered in 
most vehicle routing applications. Second, there are many difficulties involved in 
calculating accurate distances and travel times in New York City. We now consider 
these two points. 



21.5 Distance and Time Estimation 

To accurately estimate distances, one needs a precise geographic representation of 
the area. This is achieved using a geographic information system (GIS), which is
based on data files built from satellite photographs. These files store geographic 
objects, such as streets, highways, parks, and rivers, that can be presented on a 
computer screen. An important feature is the ability to calculate the exact latitude
and longitude of any point. When the GIS is given a street address, the process of
geocoding returns the coordinates of the address with very high accuracy. Having 
these coordinates makes it easy to calculate “as the crow flies,” or “Euclidean,” 
distances. Some GISs also have the capability of calculating exact road network 
distances, that is, the distance between two points on the actual street network, 
sometimes even taking into account one-way streets. 
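
As an illustration of the “as the crow flies” calculation, the sketch below computes the great-circle distance between two geocoded points with the haversine formula; over an area the size of a city this is a close stand-in for the Euclidean distance. The function name and the Earth radius constant are assumptions made for the example, not details of any particular GIS.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (assumed constant)


def crow_flies_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (latitude, longitude) points
    given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))
```

Road network distances are never shorter than this value, so it can serve as a quick lower bound before a shortest-path computation is run.
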

The Office of Pupil Transportation at the Board of Education uses a GIS called
MapInfo for Windows. The MapInfo version used by the City does not have a street
network representation of New York City. However, such a network has been dev- 
eloped by a subcontractor; therefore, accurate shortest distances between any two 
points along the street network are readily available. The current version also takes 
into account one-way streets. Although incorporating one-way street information 
may seem like a trivial task, it turned out to be very difficult. We believe most 
current geographic information systems are highly inaccurate with regard to one- 
way streets and are probably unusable without substantial error checking. The 
New York City Department of Transportation does not keep the information in 
an easily retrievable format. We had to resort to checking the one-way street sign 
database at the NYC DOT to reconstruct accurate information about one-way 
streets. Inevitably, the data collection and error checking were extremely time- 
consuming. 

Estimating accurate travel times in New York City is probably the trickiest 
part of the problem. As described above, a GIS with a street network represen- 
tation simplifies the calculation of street distances. In addition, in the GIS, each 
data structure corresponding to a street segment has space to store the average 
travel speed and/or travel time along the segment. These estimates would make 
it possible to calculate travel times along any path. The difficulty lies, of course, 
in determining these travel speeds. 

Most existing vehicle routing implementations that we are aware of use a fixed 
travel speed throughout the area of interest. Travel times are then determined by 
simply dividing the distance traveled by this universal speed. This method is most 
likely not satisfactory for New York City. Anyone who has driven in New York 
City knows the multitude of different street types and congestion levels that can 
produce a wide variety of different travel speeds. We decided to try to get some 
idea of the average speed in different parts of New York City. 

In addition to performing various timing experiments, we obtained several 
reports from the New York City Department of Transportation. These include 
“Midtown Auto Speeds — Spring 1992” and “Midtown Auto Speeds — Fall 1992.” 
These reports provide data on Midtown Manhattan average travel speeds as well as 
some data on the variance of these speeds. (Midtown Manhattan is defined as the 
rectangular area between First and Eighth Avenues and 30th and 60th Streets.) 
The data seem to suggest that speeds vary from an average of 6 miles per hour up 
to about 14 miles per hour, depending on street type, direction, and time of day. 

Our approach was to choose an estimate of speed that would be specific to each 
district; thus, a district in the Bronx would not have the same speed estimate 
as one in Midtown Manhattan. These range from about 7 to 12 miles per hour. 
An important observation made when collecting data was that when a bus experiences
below-average travel times (above-average speeds) along the beginning of
the route, the bus driver will slow down or spend more time at the stops to get
back on schedule. In addition, since the students have a scheduled pickup time, 
the bus cannot, as a rule, leave early. It must wait until a specific time before leav- 
ing the bus stop. If the bus experiences above-average travel times (below-average 
speeds), then the bus driver can speed up (slightly) and make sure to leave as
soon as all students are on the bus. Consequently, the travel time is not quite as 
random as one might think. 
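
The district-specific speed estimates translate into travel times in a direct way: each piece of a path is divided by the speed of the district it lies in. The sketch below assumes a hypothetical speed table and a hypothetical function name; the text reports only that the estimates range from about 7 to 12 miles per hour.

```python
# Hypothetical district speeds in miles per hour (illustrative values only).
DISTRICT_SPEED_MPH = {"midtown": 7.0, "upper_manhattan": 10.0, "bronx": 12.0}


def path_time_minutes(segments):
    """Travel time for a path given as (distance_miles, district) pairs."""
    return sum(60.0 * miles / DISTRICT_SPEED_MPH[district]
               for miles, district in segments)
```

With these illustrative speeds, 3.5 miles in Midtown followed by 1 mile in the Bronx takes 30 + 5 = 35 minutes, whereas a single universal speed would hide the difference between the two districts.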

To make sure that school buses meet the time-window constraints, simply having
information about travel time along the streets of New York City is not sufficient. 
The time to pick up students from their bus stops and to drop off students at 
their schools must also be taken into account. By riding the buses, we collected 
data on the time it takes to pick up or drop off students at stops or at schools. 
A linear regression was performed on the data, providing the following model for 
the pickup time: 

PTime = 19.0 + 2.6N,

where PTime = pickup time (in s), and N = the number of students picked up
at the bus stop. This regression was performed on 30 data points. The R² was
77.7% and the p-value of the independent variable was very small (< 0.001). The
regression performed on the dropoff times resulted in the equation

DTime = 29.0 + 1.9N,

where DTime = dropoff time (in s), and N = the number of students dropped
off at the school. This regression was performed on 30 data points. Here the R²
was 41.9% and the p-value of the independent variable was 0.01%. In our
implementation, we used these equations to determine approximate pickup and dropoff
times. 
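
The two fitted lines are simple enough to state as functions. The sketch below uses hypothetical function names, with the intercepts and slopes as read from the regression equations above; it is meant only to show how the service times enter a schedule calculation.

```python
def pickup_seconds(n_students: int) -> float:
    """Estimated time to board n_students at a bus stop (seconds)."""
    return 19.0 + 2.6 * n_students   # coefficients read from the PTime equation


def dropoff_seconds(n_students: int) -> float:
    """Estimated time to unload n_students at a school (seconds)."""
    return 29.0 + 1.9 * n_students   # coefficients read from the DTime equation
```

For example, a stop with 5 students adds about 32 seconds to a route, and unloading 10 students at a school adds about 48 seconds.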

Overall, the approximations and calculations made in testing the optimization 
module were designed with the goal of ensuring that a route constructed by the 
algorithm would be a driveable one. The next question is how to generate a good 
feasible solution to the school bus routing and scheduling problem. 



21.6 The Routing Algorithm 

There are many existing algorithms for school bus routing and scheduling. Numer- 
ous communities throughout the world have implemented computerized algorithms 
to perform these tasks. Overall, the success seems to be universally recognized. 
Almost all papers published in the literature mention cost savings of around 
5-10%. We recognize that it may be useless to even contemplate the meaning 
of these savings numbers since the savings may come not only from a reduction 
in cost but also from increased control of the bus routes. The magnitude of the 
“savings” is also highly dependent on what methods were in use before the com- 
puterized system was put into place. 

Transferability seems to be the critical factor. It is difficult to compare algo- 
rithms for this problem directly from the literature. Each problem has its own 
version of the constraints and even objectives. It is not always simple or even pos- 
sible to take an existing algorithm in use in one community and simply apply it 
to another. Each problem has its peculiarities and may also have very different 
constraints. For instance, in an implementation in Montreal, the people designing
the routes have the freedom to change existing school starting and ending times at 
their convenience. Clearly, this added flexibility can simplify the problem to some 
extent and can lead to additional savings in cost. In New York City, this was not 
possible. 

Finally, this is all within the framework of an optimization problem, which we 
have seen is extremely difficult to solve. There is an absence of any strong lower 
bounds on the minimum number of buses required. 

In determining what type of algorithm to apply to this large vehicle routing 
problem, we considered several important aspects of the problem and also the 
setting in which the algorithm would be used. 

Efficiency. This is an extremely large problem, so the solution method must be
efficient in computation time and in space requirement. While we might want 
to optimize by district, the fact is that some districts have as many as 1,500 
bus stops. Even though complete optimization of the solution might only be 
done once a year, the time involved in testing and experimenting with the 
problem parameters is reduced considerably if the algorithm is time- and 
space-efficient. 

Transparency. The algorithm would most likely need to be constructive in nature,
thereby providing a dispatcher with the ability to view the algorithm’s
progression in real time. This makes it possible to detect “problem routes”
and correct errors without having to wait until the termination of the algo- 
rithm. That is, the approach should build routes in a sequential fashion and 
not, for example, work for hours and finally, in the last moments, provide a 
solution. 

Flexibility. The heuristic should be flexible enough to handle not only the
constraints currently in place, but also additional constraints that might be
imposed in the future. 

Interactivity. From our discussions with the inspectors, it is clear that the
algorithm implemented must have an interactive component that would allow an
experienced inspector to help construct routes using his or her prior knowl- 
edge. That is, the algorithm must be able to work in two different modes. 
First, it must be able to act like a black box, where data are input and a 
solution is output. Second, it must also serve as an interactive tool, where 
a starting solution can be presented along with a set of unrouted stops and 
the algorithm finds the best way to add on to this starting solution. 

Multiple solutions. The algorithm should be capable of producing a series of
solutions, not simply one solution. This last point is important since each 
solution would have to be checked by an inspector, and it is possible that 
the inspector will rule out some solutions. 

Finally, the urban nature of our application, in contrast to many of the problems 
seen in the literature, should also be taken into account. As many researchers have 
noted [see Bodin and Berman (1979) and Chapleau et al. (1983)], the vehicle
capacity constraint tends to be the most binding constraint when routing in an 
urban area. This is due to the general rule that the bus will tend to “fill up” before 
the time constraints become an issue. Therefore, it seems as though algorithms 
developed for the capacitated vehicle routing problem (CVRP) (see Chap. 17) 
should be a good starting point. The difficulty is that the CVRP generally has a 
different objective function: Minimize the total distance traveled, not the number 
of vehicles used. Fortunately [see Chap. 17 or Bramel et al. (1991)], if the number of 
pickup points is very large and distances follow a general norm, when the distance is 
minimized, a byproduct of the solution is that the minimum number of vehicles will 
be used. Observe that distances in New York City come from the street network, 
not from a norm; however, since the blocks are short and somewhat uniform in 
size, the street network distance is fairly close to a norm distance, and similar 
results most likely hold. 

For these reasons, our starting point for the algorithm for the school bus routing 
and scheduling problem was the location-based heuristic (LBH) (see Sect. 17.7) 
developed for the CVRP. This algorithm has the important property that it is 
asymptotically optimal for the CVRP (see Sect. 17.7); that is, the relative error 
between the value of the solution generated by the algorithm and the optimal 
solution value tends to zero as the number of pickup points increases. 

Due to the size and complexity of the problem, we made several changes to the
LBH. The algorithm is serial in nature, as it constructs one route at a time and
not in parallel. To describe the algorithm, let the bus stops be indexed 1, 2, ..., n.
Let a route run by a single bus be denoted R_i. Let a full solution to the school bus
routing and scheduling problem be written as a set of routes {R_1, R_2, ..., R_M},
where M is the number of buses used. For each bus stop j, let school[j] be the
index of the school to which the students at stop j are destined. Let U be the set
of indices of all unvisited pickup points.

The following algorithm creates one solution to the school bus routing and 
scheduling problem. More solutions can be generated by starting the algorithm 
with different random seeds. 

Randomized LBH:

Let U = {1, 2, ..., n} and m = 0.
while (U ≠ ∅) do
{
    m ← m + 1.
    Pick a seed stop from U using a selection criterion. Call it j.
    Let U ← U \ {j}.
    Let the current route be R_m = {j → school[j]}.
    repeat
    {
        For each i ∈ U, calculate c_i = routelength(i, R_m).
        Let c_k = min_{i ∈ U} {c_i} (taken to be +∞ if U is empty).
        If c_k < +∞, then
        {
            Let R_m ← buildroute(k, R_m).
            Let U ← U \ {k}.
        }
    } until c_k = +∞.
}
M ← m.

The heuristic solution is {R_1, R_2, ..., R_M}.
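
A minimal Python sketch of the loop above follows. It is not the implementation used in this study: to stay self-contained it assumes a single school, stops located on a line, and only the capacity constraint, and all names and data (POS, LOAD, CAPACITY, randomized_lbh, and so on) are hypothetical. The functions routelength and buildroute are passed in as arguments so that richer feasibility checks could be substituted.

```python
import math
import random


def randomized_lbh(stops, school, routelength, buildroute, rng):
    """Build routes one at a time, following the pseudocode.

    routelength(i, route) -> insertion cost, or math.inf if infeasible.
    buildroute(k, route)  -> new route with stop k inserted.
    """
    unvisited = set(stops)
    routes = []
    while unvisited:
        j = rng.choice(sorted(unvisited))       # random seed selection
        unvisited.remove(j)
        route = [j, school[j]]                  # seed route: j -> school[j]
        while unvisited:
            costs = {i: routelength(i, route) for i in unvisited}
            k = min(costs, key=lambda i: (costs[i], i))
            if math.isinf(costs[k]):            # nothing else fits on this bus
                break
            route = buildroute(k, route)
            unvisited.remove(k)
        routes.append(route)
    return routes


# Toy data (hypothetical): four stops on a line, one school "S".
POS = {1: 0.0, 2: 1.0, 3: 2.0, 4: 5.0, "S": 10.0}
LOAD = {1: 2, 2: 2, 3: 2, 4: 2}
CAPACITY = 4


def _best_insertion(i, route):
    """Cheapest feasible position for stop i; the school stays last."""
    onboard = sum(LOAD[s] for s in route if s in LOAD)
    if onboard + LOAD[i] > CAPACITY:
        return math.inf, None
    best_cost, best_pos = math.inf, None
    for p in range(len(route)):                 # insert before route[p]
        nxt = POS[route[p]]
        if p == 0:
            added = abs(nxt - POS[i])
        else:
            prev = POS[route[p - 1]]
            added = abs(POS[i] - prev) + abs(nxt - POS[i]) - abs(nxt - prev)
        if added < best_cost:
            best_cost, best_pos = added, p
    return best_cost, best_pos


def routelength(i, route):
    return _best_insertion(i, route)[0]


def buildroute(k, route):
    _, p = _best_insertion(k, route)
    return route[:p] + [k] + route[p:]


routes = randomized_lbh(LOAD.keys(), {s: "S" for s in LOAD},
                        routelength, buildroute, random.Random(0))
```

Calling the function again with a different random.Random seed may pick different seed stops and therefore produce a different set of routes, which mirrors the way multiple candidate solutions are generated.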

The selection of the seed stops can be done in one of several different ways. 
One approach is to simply select these stops at random from the set of unvisited 
stops. Another approach is to select stops with large loads or stops that have 
tight delivery windows (i.e., the distance and time constraints force these stops 
to be delivered almost directly from the stop to the school with very few stops in 
between). Other criteria were used according to which constraints were binding at
particular stops. 

The function routelength(i, R) determines the approximate cost of inserting stop
i into route R. Route R consists of a path through several stops and schools. While
preserving the order of the stops and schools in route R, we determine the best
insertion point for stop i. We check each consecutive pair of points (either stops
or schools) along route R and check whether stop i can be inserted between these
two. If school[i] is not in route R, then we must find not only the best insertion
point for stop i, but also the best insertion point for school[i]. It is possible that no
insertion point(s) can be found that results in a feasible route. Checking whether
a stop can be inserted requires checking all the constraints. If no feasible insertion
point exists, then the value of routelength(i, R) is made +∞. This indicates that it
is not possible (while preserving the order of R) to insert stop i into route R. If an
insertion is found that results in a feasible route, then the value of routelength(i, R)
is made to be exactly the additional distance traveled.

To illustrate the difficulty of this step, consider simply the capacity constraint. 
In the case of the CVRP, all loads are dropped off at the same point (the final 
stop); therefore, the maximum load that is carried by the vehicle is when it picks 
up its last load. Therefore, it is easy to check whether a stop can be added to a 
route since we need only check that the maximum load is less than the vehicle 
capacity. This maximum load is always at the last stop, so the calculation is 
easy. By contrast, performing a similar calculation in the school bus routing and 
scheduling problem is much more complicated since there is more than one dropoff 
point. Checking feasibility when adding a stop to a route requires knowing when 
the student is getting on and off the bus, since this will affect whether there is 
room for a student at future points on the bus route. Therefore, checking whether 
the capacity constraint is violated in the school bus routing problem is much more 
complicated than in the CVRP. 
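
To make the contrast concrete, the following sketch tracks the load on a bus as it moves along a route with several dropoff points. The route encoding (tuples tagged "stop" or "school") and the function names are assumptions chosen for the example; the capacity of 66 is the value quoted for Manhattan.

```python
def max_onboard(route):
    """Peak number of students on the bus along a route.

    route is a list of ("stop", school_id, n_students) pickups and
    ("school", school_id) dropoffs, in driving order.
    """
    onboard = {}      # school_id -> students for that school now on the bus
    peak = 0
    for event in route:
        if event[0] == "stop":
            _, school, n = event
            onboard[school] = onboard.get(school, 0) + n
        else:
            onboard.pop(event[1], None)   # everyone for this school gets off
        peak = max(peak, sum(onboard.values()))
    return peak


def capacity_ok(route, capacity=66):
    return max_onboard(route) <= capacity
```

The same three pickups can be feasible or infeasible depending on where the first school falls in the sequence: with 30 students per stop, dropping off at school A before the last pickup keeps the peak at 60, while postponing that dropoff raises it to 90.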

The function buildroute(k, R) creates the route that results from the insertion of
stop k into route R. Again, stop k is simply inserted between the two consecutive
points (stops or schools) that result in the shortest total route. This route is
guaranteed to be feasible since c_k < +∞.

The algorithm satisfies the requirements that we described above. It runs effi- 
ciently for problems of a large size and builds routes sequentially. It is very flexible 
in the sense that constraints of almost any type can be included (e.g., disallowing 
mixed loads for some schools). Of course, each additional constraint causes the 
algorithm to take a little longer to find a solution. In terms of its interactivity (see 
the next section for details), the algorithm can be used in an interactive mode if 
this is desired. In this mode, a partial routing solution can be used as a starting 
point and unrouted stops can be added efficiently. The inspector can also have a 
major impact on the routes generated by the algorithm via the selection of the 
seed points (see Sect. 21.8 below for a further discussion on this point). Since the 
algorithm can be easily randomized (by randomizing the seed stop selection pro- 
cedure), starting the algorithm with different random numbers makes it generate 
different solutions. Finally, the most important advantage of this heuristic is that 
it does not decompose the problem into subproblems, but solves the routing and 
scheduling components simultaneously. 



21.7 Additional Constraints and Features 

In the course of the implementation of our algorithm, several additional “soft” 
constraints came to our attention. These are subtle rules that inspectors used when 
constructing feasible routes, which were only determined once a set of routes were 
shown to the inspectors. 

Limit on the number of buses to a particular school. This is best explained
with an example. Consider the situation where a school, say school
A, has a late starting time relative to other schools, say 9:30 a.m., where 
all other schools start at 9 a.m., and assume only a dozen of the students 
from school A require bus service. Previously, if a solution required 20 buses 
to serve all schools, routers would take one of these and have it alone serve 
school A. That is, some time between 9 and 9:30 a.m., one bus would pick 
up the dozen students and deliver them to school A. Since 20 buses are used 
in the solution, this solution is equivalent to, for example, having 6 of the 
20 buses each deliver 2 students to school A between 9 and 9:30 a.m. This, 
from a cost point of view, is just as good a solution. However, school A may 
only be able to handle one or two buses at a time due to limited driveway 
space. We therefore needed to add a constraint on the number of buses that 
could deliver students to each school. This constraint only became active for 
a few schools. 

Multilevel relational distance constraints. When a driver is delivering packages
to warehouses or to customers, a distance constraint is usually set on
the complete route and thus is limited to the driver’s working day. When a
bus driver is delivering students to schools, the distance constraint is really
student-specific. That is, each student’s trip is limited, not just the driver’s.
In the school bus routing and scheduling problem, the distance constraint 
also illustrates the difficulty of modeling, through simple constraints, a real-life
problem. To illustrate this, consider the 5-mile distance constraint discussed
earlier. We found that this simple constraint was actually unsatisfactory for
this problem. For example, if a student was only 1 mile from school, then
it was not considered desirable to have him or her end up traveling 5 miles
on the bus. This student (and maybe more vociferously his or her parents)
would not consider this an equitable solution. We therefore decided to implement
what we call a relational distance constraint. That is, for a multiplier
α, say α = 2, a student could not travel on the bus for more than α times
the distance the student’s bus stop was from school. The question was then
what value to choose for α. We determined that the best rule was to divide the
region around a particular school into concentric rings. For example, if the
first ring was 3 miles in radius, then a stop that was d < 3 miles from the
school would have a distance constraint (on the bus) of α_1 d miles. Ring i
was assigned a multiplier α_i, and this was repeated for each ring. Although
it took some time to determine appropriate multipliers, eventually this is the
type of distance constraint that was implemented.

Waiting-time constraint. Another constraint that did not come to our attention
until we presented our routes to the inspectors was the waiting-time
constraint. Again, this is something that is specific to the routing of people
as opposed to packages. Consider a simple problem with two schools, school
A starting at 8 a.m. and school B starting at 9 a.m. At 7:30 a bus picks
up students for both schools A and B and then arrives at school A in the
time window (say at 7:45) and drops off only those students who are going
to school A. Since school B starts at 9 a.m., the bus waits for half an hour
at school A before proceeding to pick up some more students for school B and
then arriving at school B at 8:45 and dropping off all the students. A route
of this type, where students wait on the bus for half an hour, was definitely 
not deemed acceptable. Therefore, we needed to add a constraint on the 
amount of time a bus (with students on it) can wait idle. Five minutes was 
the number that was eventually used. 

Route balancing. It is desirable that the routes in a solution be of similar
duration and total distance. It does not seem fair if one driver serves morn- 
ing routes from 7 to 7:30 a.m. while another works from 7 to 9:30 a.m. The 
balancing of the workloads is partially achieved by implementation of a
route-balance() subroutine that is called once, at the end of the algorithm. This
subroutine essentially moves stops and schools from heavily loaded routes to 
less heavily loaded routes while maintaining feasibility of the solution. This 
seemed to work very well. 

Single-route optimization. Once a solution is determined, we may (and should)
optimize the sequencing of the stops and schools on each route individually. 
That is, given a set of stops and schools that can be feasibly served by one 
bus, in terms of service level, what is the “best” route to actually drive? 
An objective that guarantees a high service level is to minimize the total 
number of student-miles traveled (see, e.g., Bennett and Gazis (1972)). For 
each route created, we call a procedure, route-opt(), which minimizes
the total number of student-miles while maintaining feasibility of the route. 
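
To make the student-miles objective concrete, the sketch below evaluates it
for a route with a single school and picks the best pickup order by brute
force. This is an illustration under stated assumptions, not the route-opt()
procedure: the single-school setting, the exhaustive search, the is_feasible
placeholder, and all of the data are hypothetical.

```python
from itertools import permutations


def student_miles(order, school, students, dist):
    """Sum over legs of (students on board) x (leg length in miles)."""
    total, on_board = 0.0, 0
    seq = list(order) + [school]
    for a, b in zip(seq, seq[1:]):
        on_board += students[a]          # students picked up at stop a
        total += on_board * dist[a][b]   # they all ride the leg a -> b
    return total


def route_opt(stops, school, students, dist, is_feasible=lambda order: True):
    """Return the feasible pickup order with the fewest student-miles."""
    candidates = [p for p in permutations(stops) if is_feasible(p)]
    return min(candidates, key=lambda p: student_miles(p, school, students, dist))


# Hypothetical data: stop 'a' has 20 students, stop 'b' has 2.
students = {"a": 20, "b": 2}
dist = {
    "a": {"b": 2.0, "S": 1.0},
    "b": {"a": 2.0, "S": 1.5},
}
best = route_opt(["a", "b"], "S", students, dist)
best_cost = student_miles(best, "S", students, dist)
```

Picking up the small stop first gives 26 student-miles, against 73 for the
reverse order. Exhaustive search is only practical for a handful of stops.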



21.8 The Interactive Mode 

As we mentioned earlier, the complete rescheduling of all buses might only be done 
once a year (in August). However, throughout the course of the school year, there 
are quite a few small changes that must be made to the solution. These changes 
could be caused by, for example: 

• A school that previously did not request bus service does request service in 
mid-year. 

• A student changes address or school. 

• A school’s session time changes. 

One option might be simply to reoptimize all routes that are affected by the 
changes. This might cause major disruptions in a large number of routes. These 
disruptions may translate to disruptions in the parents’ morning schedules, which 
might overload the Office of Pupil Transportation telephone switchboard. 
In essence, it is desirable to implement the changes while making the fewest dis- 
ruptions to other students’ schedules. 

This was the impetus for the development of the algorithm’s interactive mode. 
Here it is possible to start the algorithm with a number of routes already created 
and to simply add stops to or delete stops from these routes. Let’s consider what 
happens when a stop is added to an existing set of routes. The user can
select one of three options:

• Complete reoptimization. This corresponds to starting the reoptimization 
from scratch, that is, throwing away all previously created routes. Optimiza- 
tion then starts with all new stops added to the list of stops. 

• Single-route reoptimization. This corresponds to selecting a route and check- 
ing whether a particular stop can be added to it. This is done through a 
simple route-check() subroutine. In this case, the route may be completely 
resequenced. 

• No reoptimization. In this case, the stop is simply inserted between two stops 
on existing routes without any reoptimization. 
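
The third option can be sketched as follows. The text does not say how the
insertion position is chosen; the sketch assumes, purely for illustration,
that the stop goes where it adds the least distance. The function name, the
stop labels, and the positions along a line are all hypothetical.

```python
def cheapest_insertion(route, new_stop, dist):
    """Return (position, added_miles) for inserting new_stop between two
    consecutive stops of route without resequencing anything else."""
    best_pos, best_extra = None, float("inf")
    for i in range(len(route) - 1):
        a, b = route[i], route[i + 1]
        extra = dist[a][new_stop] + dist[new_stop][b] - dist[a][b]
        if extra < best_extra:
            best_pos, best_extra = i + 1, extra
    return best_pos, best_extra


# Hypothetical stops placed along a line, in miles from the first stop.
pos = {"p": 0, "q": 4, "S": 10, "n": 5}
dist = {a: {b: abs(pos[a] - pos[b]) for b in pos} for a in pos}

route = ["p", "q", "S"]
position, added = cheapest_insertion(route, "n", dist)
new_route = route[:position] + ["n"] + route[position:]
```

Here the new stop lies on the way from q to S, so it is inserted there at no
extra distance. A full implementation would still have to check the time
windows and the waiting-time limit on the modified route.
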

Deleting a stop is somewhat easier to do; the user simply clicks the mouse on
the stop in question and deletes it from the current solution. The fact that this
may actually render the remaining route infeasible is a good illustration of the
complexity of the bus routing and scheduling problem. The infeasibility arises
from the waiting-time constraint mentioned in the previous section: with one
stop fewer, the bus may reach a school earlier and then idle, with students on
board, for longer than allowed. In either case, the user can
specify whether a reoptimization of the route is desired.

These optimization tools proved quite useful as they provided simple ways to 
test what-if scenarios, tests that previously would have taken weeks, if not months. 



21.9 Data, Implementation, and Results 

To assess the effectiveness of our algorithm, we attempted to solve the problem 
using the Manhattan data given to us by the Office of Pupil Transportation, that 
is, to use our algorithm to generate a solution and to check it for actual drivability. 

We solved both the morning and the afternoon problem. We first calculated 
the shortest-distance matrix between all 911 points of interest (838 stops and 73 
schools) along the street network. In our implementation, we used a speed of 8 miles 
per hour for the entire borough. This was the lowest average speed in Midtown 
Manhattan along a street or avenue between 7 and 10 a.m. (the time interval 
that the bus would be traveling in the morning) reported by the Department of 
Transportation. We feel that this average speed is quite conservative and that, on 
average, a bus can travel more quickly. One reason for this is that the measurement 
was made in Midtown Manhattan, a location with very high congestion throughout 
the day. 
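
The preprocessing step can be illustrated on a toy network. The text does not
name the shortest-path method that was used; the sketch below assumes the
Floyd-Warshall algorithm and undirected street segments, both chosen only for
brevity, and then converts miles to minutes at a single universal speed. The
three-node network is hypothetical.

```python
import math


def shortest_distance_matrix(n, edges):
    """All-pairs shortest distances (miles) by Floyd-Warshall.

    edges is a list of (u, v, miles) street segments, treated as undirected.
    """
    d = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def travel_minutes(miles, mph=8.0):
    """Travel time in minutes at one universal speed for the whole borough."""
    return miles * 60.0 / mph


# Toy network: the direct segment 0-2 is longer than the detour through 1.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 3.0)]
d = shortest_distance_matrix(3, edges)
minutes_at_8 = travel_minutes(d[0][2])
minutes_at_12 = travel_minutes(d[0][2], mph=12.0)
```

The two-mile trip takes 15 minutes at 8 mph and 10 minutes at 12 mph, which
shows how strongly the assumed speed affects every travel time in the matrix.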

The algorithm was run on a PC (486DX2/50 megahertz) under Windows over 
a period of several hours. To generate its first feasible solution, the algorithm 
takes about 40 min. We repeated the algorithm 40 times, keeping track of the best 
solution. The algorithm has as output a detailed schedule and directions for each 
bus. 

In order to determine the sensitivity of the results to some of the assumptions we 
have made, we ran the algorithm with several settings for the average travel speed. 
We used 8, 10, and 12 mph. Note again that these speeds are conservative, as we
have also taken into account the time to stop and pick up or drop off students. 
The following table lists the number of buses used in the best solutions found for 
each of these settings and for the morning and afternoon problems (Table 21.1). 



TABLE 21.1. General Education routing

Universal          Number of buses used
speed (mph)        Morning      Afternoon

     8                74            67
    10                64            60
    12                59            56


As a comparison, these solutions use substantially fewer buses than are currently
in use. We do not expect that the number of buses used will be as low as indicated 
by our preliminary results, due to the fact that the routes have not been checked 
by the inspectors. However, it is reasonable to assume that they will serve as a 
starting solution that can be modified by the inspectors. 



References 



Aggarwal, A., & Park, J. K. (1993). Improved algorithms for economic lot-size
problems. Operations Research, 41, 549-571.

Agrawal, V., & Seshadri, S. (2000). Impact of uncertainty and risk aversion
on price and order quantity in the newsvendor problem. Manufacturing and
Service Operations Management, 2(4), 410-423.

Aho, A. V., Hopcroft, J. E., & Ullman, J. D. (1974). The design and analysis of
computer algorithms. Reading, MA: Addison-Wesley.

Altinkemer, K., & Gavish, B. (1987). Heuristics for unequal weight delivery
problems with a fixed error guarantee. Operations Research Letters, 6, 149-158.

Altinkemer, K., & Gavish, B. (1990). Heuristics for delivery problems with
constant error guarantees. Transportation Science, 24, 294-297.

Angel, R. D., Caudle, W. L., Noonan, R., & Whinston, A. (1972).
Computer-assisted school bus scheduling. Management Science, 18, 279-288.

Anily, S. (1991). Multi-item replenishment and storage problems (MIRSP):
heuristics and bounds. Operations Research, 39, 233-239.

Anily, S., Bramel, J., & Simchi-Levi, D. (1994). Worst-case analysis of heuristics
for the bin-packing problem with general cost structures. Operations Research,
42, 287-298.



D. Simchi-Levi et al., The Logic of Logistics: Theory, Algorithms, and Applications 421 

for Logistics Management , Springer Series in Operations Research and Financial Engineering, 
DOI 10.1007/978-1-4614-9149-1, © Springer Science+Business Media New York 2014 






Archibald, B., & Silver, E. A. (1978). (s, S) policies under continuous review and
discrete compound Poisson demand. Management Science, 24, 899-908.

Arkin, E., Joneja, D., & Roundy, R. (1989). Computational complexity of
uncapacitated multi-echelon production planning problems. Operations Research
Letters, 8, 61-66.

Arrow, K., Harris, T., & Marschak, J. (1951). Optimal inventory policy.
Econometrica, 19, 250-272.

Atkins, D. R., & Iyogun, P. (1988). A heuristic with lower bound performance
guarantee for the multi-product dynamic lot-size problem. IIE Transactions,
20, 369-373.

Azuma, K. (1967). Weighted sums of certain dependent random variables. Tohoku
Mathematical Journal, 19, 357-367.

Baker, B. S. (1985). A new proof for the first-fit decreasing bin packing algorithm.
Journal of Algorithms, 6, 49-70.

Baker, K. R., Dixon, P., Magazine, M. J., & Silver, E. A. (1978). An algorithm for
the dynamic lot-size problem with time-varying production capacity constraints.
Management Science, 24, 1710-1720.

Balakrishnan, A., & Graves, S. (1989). A composite algorithm for a concave-cost
network flow problem. Networks, 19, 175-202.

Balinski, M. L. (1965). Integer programming: methods, uses, computation.
Management Science, 12, 253-313.

Balinski, M. L., & Quandt, R. E. (1964). On an integer program for a delivery
problem. Operations Research, 12, 300-304.

Ball, M. O., Magnanti, T. L., Monma, C. L., & Nemhauser, G. L. (Eds.) (1995).
Network routing, Handbooks in operations research and management science.
Amsterdam: North-Holland.

Ballou, R. H. (1992). Business logistics management (3rd ed.). Englewood Cliffs,
NJ: Prentice-Hall.

Barcelo, J., & Casanovas, J. (1984). A heuristic Lagrangian algorithm for the
capacitated plant location problem. European Journal of Operational Research,
15, 212-226.

Basar, T., & Olsder, G. J. (1999). Dynamic noncooperative game theory (Classics
in applied mathematics). Philadelphia: Society for Industrial and Applied
Mathematics.

Beasley, J. (1983). Route first-cluster second methods for vehicle routing. Omega,
11, 403-408.






Beardwood, J., Halton, J. L., & Hammersley, J. M. (1959). The shortest path 
through many points. Proceedings of the Cambridge Philosophical Society, 55, 
299-327. 

Bell, C. (1970). Improved algorithms for inventory and replacement stock prob- 
lems. SIAM Journal on Applied Mathematics, 18, 558-566. 

Bennett, B., & Gazis, D. (1972). School bus routing by computer. Transportation 
Research, 6, 317-326. 

Bertsekas, D. P. (1987). Dynamic programming. Englewood Cliffs, NJ:
Prentice-Hall.

Bertsekas, D. P. (1995). Nonlinear programming. Boston: Athena Scientific. 

Bernstein, F., & Federgruen, A. (2004). Dynamic inventory and pricing models for 
competing retailers. Naval Research Logistics, 51, 258-274. 

Bertsimas, D., & Simchi-Levi, D. (1996). The new generation of vehicle routing
research: robust algorithms addressing uncertainty. Operations Research, 44,
286-304.

Bienstock, D., & Simchi-Levi, D. (1993). A note on the prize collecting traveling 
salesman problem. Working Paper, Columbia University. 

Bienstock, D., Bramel, J., & Simchi-Levi, D. (1993a). A probabilistic analysis of 
tour partitioning heuristics for the capacitated vehicle routing problem with 
unsplit demands. Mathematics of Operations Research, 18, 786-802. 

Bienstock, D., Goemans, M., Simchi-Levi, D., & Williamson, D. (1993b). A note 
on the prize collecting traveling salesman problem. Mathematical Programming, 
59, 413-420. 

Bodin, L., & Berman, L. (1979). Routing and scheduling of school buses by com- 
puter. Transportation Science, 13, 113-129. 

Bondareva, O. (1963). Some applications of linear programming methods to the 
theory of cooperative games (in Russian). Problemy Kybernetiki, 10, 119-139. 

Bourakiz, M., & Sobel, M. J. (1992). Inventory control with an exponential utility
criterion. Operations Research, 40, 603-608.

Braca, J., Bramel, J., Posner, B., & Simchi-Levi, D. (1997). A computerized
approach to the New York City school bus routing problem. IIE Transactions, 29,
693-702.

Bramel, J., & Simchi-Levi, D. (1995). A location based heuristic for general routing
problems. Operations Research, 43, 649-660.






Bramel, J., & Simchi-Levi, D. (1996). Probabilistic analysis and practical
algorithms for the vehicle routing problem with time windows. Operations
Research, 44, 501-509.

Bramel, J., & Simchi-Levi, D. (1997). On the effectiveness of set covering
formulations for the vehicle routing problem. Operations Research, 45, 295-301.

Bramel, J., Coffman Jr., E. G., Shor, P., & Simchi-Levi, D. (1991). Probabilistic
analysis of algorithms for the capacitated vehicle routing problem with unsplit
demands. Operations Research, 40, 1095-1106.

Cachon, G. (2003). Supply chain coordination with contracts. In S. Graves, T. de
Kok (Eds.) Supply chain management: design, coordination and operation,
Handbooks in operations research and management science (Vol. 11, pp. 229-340).
Amsterdam: Elsevier.

Cachon, G., & Lariviere, M. (2005). Supply chain coordination with revenue shar- 
ing contracts: strengths and limitations. Management Science, 51, 30-44. 

Cachon, G., & Netessine, S. (2004). Game theory in supply chain analysis. In 
D. Simchi-Levi, S. D. Wu, Z. J. Max Shen (Eds.) Handbook of quantitative supply 
chain analysis: modeling in the eBusiness era. Boston: Kluwer Academic. 

Chapleau, L., Ferland, J. A., & Rousseau, J.-M. (1983). Clustering for routing in 
dense area. European Journal of Operational Research, 20, 48-57. 

Chan, L. M. A., Simchi-Levi, D., & Bramel, J. (1998). Worst-case analyses, lin- 
ear programming and the bin-packing problem. Mathematical Programming, 83, 
213-227. 

Chandra, B., Karloff, H., & Tovey, C. (1999). New results on the old k-opt
algorithm for the traveling salesman problem. SIAM Journal on Computing, 28,
1998-2029.

Chan, L. M. A., Muriel, A., & Simchi-Levi, D. (1999). Production/distribution
planning problems with piecewise linear and concave cost structures.
Northwestern University.

Chan, L. M. A., Max Shen, Z. J., Simchi-Levi, D., & Swann, J. (2004). Coor- 
dination of pricing and inventory (Chapter 3) In D. Simchi-Levi, S. D. Wu, 
Z. J. Max Shen (Eds.) Handbook of quantitative supply chain analysis: modeling 
in the eBusiness era. Boston: Kluwer Academic. 

Chen, Y. F. (1996). On the optimality of (s, S) policies for quasiconvex loss func- 
tions. Working Paper, Northwestern University. 

Chen, X. (2003). Coordinating inventory control and pricing strategies (Ph.D.
Dissertation, Massachusetts Institute of Technology).






Chen, X. (2009). Inventory centralization games with price-dependent demand and
quantity discount. Operations Research, 57, 1394-1406.

Chen, F., & Federgruen, A. (2000). Mean-variance analysis of basic inventory
models. Working paper, Columbia University.

Chen, X., & Hu, P. (2012). Joint pricing and inventory management with
deterministic demand and costly price adjustment. Operations Research Letters,
40, 385-389.

Chen, X., & Simchi-Levi, D. (2004a). Coordinating inventory control and pricing
strategies with random demand and fixed ordering cost: the finite horizon case.
Operations Research, 52, 887-896.

Chen, X., & Simchi-Levi, D. (2004b). Coordinating inventory control and pricing 
strategies with random demand and fixed ordering cost: the infinite horizon case. 
Mathematics of Operations Research, 29, 698-723. 

Chen, X., & Simchi-Levi, D. (2009). A new approach for the stochastic cash bal- 
ance problem with fixed costs. Probability in the Engineering and Informational 
Sciences, 23, 545-562. 

Chen, X., & Simchi-Levi, D. (2012). Pricing and inventory management. In
P. Philips, O. Ozer (Eds.) Oxford handbook of pricing management (pp. 784-822).
Oxford: Oxford University Press.

Chen, X., & Sun, P. (2012). Optimal structural policies for ambiguity and risk 
averse inventory and pricing models. SIAM Journal on Control and Optimiza- 
tion, 50, 133-146. 

Chen, X., & Zhang, J. (2009). A stochastic programming approach to inventory
centralization games. Operations Research, 57, 840-851.

Chen, F., & Zheng, Y. S. (1994). Lower bounds for multi-echelon stochastic inven- 
tory systems. Management Science, 40, 1426-1443. 

Chen, X., Sim, M., Simchi-Levi, D., & Sun, P. (2007). Risk aversion in inventory
management. Operations Research, 55, 828-842.

Chen, X., Zhang, Y., & Zhou, S. (2010). Integration of inventory and pricing
decisions with costly price adjustments. Operations Research, 58, 1012-1016.

Chen, X., Zhou, S., & Chen, F. (2011). Preservation of quasi-K-concavity and
its application to joint inventory-pricing models with concave ordering costs.
Operations Research, 58, 1012-1016.

Chen, X., Pang, Z., & Pan, L. (2012a). Coordinating inventory control and
pricing strategies for perishable products. Working Paper, University of Illinois
at Urbana-Champaign.






Chen, X., Hu, P., & He, S. (2012b). Preservation of supermodularity in two di- 
mensional parametric optimization problems and its applications. This paper 
has been accepted by Operations Research. 

Chen, X., Hu, P., Shum, S., & Zhang, Y. (2012c). Stochastic inventory model with 
reference price effects. Working Paper. 

Chou, M., Teo, C., & Zheng, H. (2008). Process flexibility: design, evaluation, and
applications. Flexible Services and Manufacturing Journal, 20(1), 59-94.

Chou, M., Chua, G., Teo, C., & Zheng, H. (2010). Design for process flexibility:
efficiency of the long chain and sparse structure. Operations Research, 58, 43-58.

Chou, M., Chua, G., Teo, C., & Zheng, H. (2011). Process flexibility revisited:
the graph expander and its applications. Operations Research, 59, 1090-1105.

Chou, M., Chua, G., Teo, C., & Zheng, H. (2012). On the performance of sparse 
process structures in partial postponement production systems. Working Paper. 

Christofides, N. (1976). Worst-case analysis of a new heuristic for the traveling 
salesman problem. Report 388, Graduate School of Industrial Administration, 
Carnegie-Mellon University, Pittsburgh, PA. 

Christofides, N. (1985). Vehicle routing. In E. L. Lawler, J. K. Lenstra, A. H. G. 
Rinnooy Kan, D. B. Shmoys (Eds.) The traveling salesman problem: a guided 
tour of combinatorial optimization (pp. 431-448). New York: Wiley. 

Christofides, N., Mingozzi, A., & Toth, P. (1978). The vehicle routing problem. In 
N. Christofides, A. Mingozzi, P. Toth, C. Sandi (Eds.) Combinatorial optimiza- 
tion (pp. 318-338). New York: Wiley. 

Christofides, N., Mingozzi, A., & Toth, P. (1981). Exact algorithms for the
vehicle routing problem based on spanning tree and shortest path relaxations.
Mathematical Programming, 20, 255-282.

Churchman, C. W., Ackoff, R. L., & Arnoff, E. L. (1957). Introduction to operations 
research. New York: Wiley. 

Clark, A. J., & Scarf, H. E. (1960). Optimal policies for a multi-echelon inventory
problem. Management Science, 6, 475-490.

Clarke, G., & Wright, J. W. (1964). Scheduling of vehicles from a central depot to
a number of delivery points. Operations Research, 12, 568-581.

Coffman, E. G. Jr., & Lueker, G. S. (1991). Probabilistic analysis of packing and 
partitioning algorithms. New York: Wiley. 

Cornuejols, G., & Harche, F. (1993). Polyhedral study of the capacitated vehicle 
routing problem. Mathematical Programming, 60, 21-52. 






Cornuejols, G., Fisher, M. L., & Nemhauser, G. L. (1977). Location of bank
accounts to optimize float: an analytical study of exact and approximate
algorithms. Management Science, 23, 789-810.

Council of Supply Chain Management Professionals: http://www.cscmp.org/. 

Council on Logistics Management, mission statement, Council on Logistics Man- 
agement Web Site: www.clml.org/mission.html. 

Croxton, K. L., Gendron, B., & Magnanti, T. L. (2003). A comparison of
mixed-integer programming models for non-convex piecewise linear cost
minimization problems. Management Science, 40, 1268-1273.

Cullen, F., Jarvis, J., & Ratliff, D. (1981). Set partitioning based heuristics for 
interactive routing. Networks, 11, 125-144. 

Daskin, M. S. (1995). Network and discrete location: models, algorithms, and
applications. New York: Wiley.

De Kok, A. G., & Graves, S. C. (Eds.) (2003). Supply chain management: design, 
coordination and operations, Handbooks in operations research and management 
science. Amsterdam: North-Holland. 

Dematteis, J. J. (1968). An economic lot sizing technique: the part-period algo- 
rithm. IBM Systems Journal, 7, 30-38. 

Denardo, E. V. (1996). Dynamic programming. In Avriel, M., Golany, B. (Eds.), 
Mathematical programming for industrial engineers (pp. 307-384). Englewood 
Cliffs, NJ: Marcel Dekker. 

Deng, Q., & Simchi-Levi, D. (1992). Valid inequalities, facets and computa- 
tional results for the capacitated concentrator location problem. Working Paper, 
Columbia University. 

Deng, S., & Yano, C. (2006). Joint production and pricing decisions with setup 
costs and capacity constraints. Management Science, 52, 741-756. 

Desrosiers, J., Ferland, J. A., Rousseau, J.-M., Lapalme, G., & Chapleau, L. 
(1986). TRANSCOL: A multi-period school bus routing and scheduling system. 
TIMS Studies in the Management Sciences, 22, 47-71. 

Desrochers, M., Desrosiers, J., & Solomon, M. (1992). A new optimization
algorithm for the vehicle routing problem with time windows. Operations Research,
40, 342-354.

Dobson, G. (1987). The economic lot scheduling problem: a resolution of feasibility
using time varying lot sizes. Operations Research, 35, 764-771.

Dreyfus, S. E., & Law, A. M. (1977). The art and theory of dynamic programming. 
New York: Academic Press. 






Edmonds, J. (1965). Maximum matching and a polyhedron with 0,1-vertices.
Journal of Research of the National Bureau of Standards B, 69B, 125-130.

Edmonds, J. (1971). Matroids and the greedy algorithm. Mathematical
Programming, 1, 127-136.

Eeckhoudt, L., Gollier, C., & Schlesinger, H. (1995). The risk-averse (and prudent)
newsboy. Management Science, 41(5), 786-794.

Eliashberg, J., & Steinberg, R. (1991). Marketing-production joint decision
making. In J. Eliashberg, J. D. Lilien (Eds.) Management science in marketing,
Vol. 5 of Handbooks in Operations Research and Management Science. Amsterdam:
North-Holland.

Elmaghraby, W., & Keskinocak, P. (2003). Dynamic pricing in the presence of 
inventory considerations: research overview, current practices, and future direc- 
tions. Management Science, 49, 1287-1309. 

Eppen, G., & Schrage, L. (1981). Centralized ordering policies in a multiwarehouse 
system with lead times and random demand. In L. Schwarz (Ed.) Multi-level 
production/inventory control systems: theory and practice. Amsterdam: North- 
Holland. 

Erlenkotter, D. (1990). Ford Whitman Harris and the economic order quantity
model. Operations Research, 38, 937-946.

Federgruen, A., & Heching, A. (1999). Combined pricing and inventory control
under uncertainty. Operations Research, 47(3), 454-475.

Federgruen, A., & van Ryzin, G. (1997). Probabilistic analysis of a generalized bin
packing problem with applications to vehicle routing and scheduling problems.
Operations Research, 45, 596-609.

Federgruen, A., & Simchi-Levi, D. (1995). Analytical analysis of vehicle routing
and inventory routing problems. In M. O. Ball, T. L. Magnanti, C. L. Monma,
G. L. Nemhauser (Eds.) Handbooks in operations research and management
science, the volume on Network routing (pp. 297-373). Amsterdam: North-Holland.

Federgruen, A., & Tzur, M. (1991). A simple forward algorithm to solve general
dynamic lot sizing models with n periods in O(n log n) or O(n) time.
Management Science, 37, 909-925.

Federgruen, A., & Zipkin, P. (1984a). Approximation of dynamic, multi-location
production and inventory problems. Management Science, 30, 69-84.

Federgruen, A., & Zipkin, P. (1984b). Computational issues in the infinite horizon,
multi-echelon inventory model. Operations Research, 32, 818-836.






Federgruen, A., & Zipkin, P. (1984c). Allocation policies and cost approximation
for multi-location inventory systems. Naval Research Logistics Quarterly, 31,
97-131.

Few, L. (1955). The shortest path and the shortest road through n points.
Mathematika, 2, 141-144.

Fisher, M. L. (1980). Worst-case analysis of algorithms. Management Science, 26,
1-17.

Fisher, M. L. (1981). The Lagrangian relaxation method for solving integer
programming problems. Management Science, 21, 1-18.

Fisher, M. L. (1994). Optimal solution of vehicle routing problems using minimum
K-trees. Operations Research, 42, 626-642.

Fisher, M. L. (1995). Vehicle routing. In M. O. Ball, T. L. Magnanti, C. L. Monma,
G. L. Nemhauser (Eds.) Handbooks in operations research and management
science, the volume on Network routing (pp. 1-33). Amsterdam: North-Holland.

Fisher, M. L., & Jaikumar, R. (1981). A generalized assignment heuristic for vehicle 
routing. Networks, 11, 109-124. 

Florian, M., & Klein, M. (1971). Deterministic production planning with concave 
costs and capacity constraints. Management Science, 18, 12-20. 

Florian, M., Lenstra, J. K., & Rinnooy Kan, A. H. G. (1980). Deterministic produc- 
tion planning: algorithms and complexity. Management Science, 26, 669-679. 

Fudenberg, D., & Tirole, J. (1991). Game theory. Cambridge, MA: MIT Press. 

Gale, D., & Politof, T. (1981). Substitutes and complements in network flow prob- 
lems. Discrete Applied Mathematics, 3, 175-186. 

Gallego, G., & van Ryzin, G. (1994). Optimal dynamic pricing of inventories with 
stochastic demand over finite horizons. Management Science, 40, 999-1020. 

Gallego, G., Queyranne, M., & Simchi-Levi, D. (1996). Single resource multi-item
inventory system. Operations Research, 44, 580-595.

Garey, M. R., & Johnson, D. S. (1979). Computers and intractability. New York: 
W. H. Freeman and Company. 

Garey, M. R., Graham, R. L., Johnson, D. S., & Yao, A. C. (1976). Resource 
constrained scheduling as generalized bin packing. Journal of Combinatorial 
Theory, Series A, 21, 257-298. 

Gaskel, T. J. (1967). Bases for vehicle fleet scheduling. Operational Research Quar- 
terly, 18, 281-295. 






Geunes, J., Romeijn, E., & Taaffe, K. (2006). Requirements planning with pricing
and order selection flexibility. Operations Research, 54, 394-401.

Ghosh, A. (1994). Retail management (2nd ed.). New York, NY: Dryden Press
Harcourt Brace College Publishers.

Gillett, B. E., & Miller, L. R. (1974). A heuristic algorithm for the vehicle dispatch
problem. Operations Research, 22, 340-349.

Goemans, M. X., & Bertsimas, D. J. (1993). Survivable networks, linear
programming relaxations and the parsimonious property. Mathematical Programming,
60, 145-166.

Golden, B. L., & Stewart, W. R. (1985). Empirical analysis of heuristics. In E. L. 
Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, D. B. Shmoys (Eds.) The traveling 
salesman problem: a guided tour of combinatorial optimization (pp. 207-249). 
New York: Wiley. 

Goyal, S. K. (1978). A note on “multi-product inventory situation with one
restriction.” Journal of the Operational Research Society, 29, 269-271.

Graves, S. C. (2008). Flexibility principles. In Building intuition: insights from
basic operations management models and principles (Chapter 3, pp. 33-51). New
York: Springer.

Graves, S. C., & Schwarz, L. B. (1977). Single cycle continuous review policies for
arborescent production/inventory systems. Management Science, 23, 529-540.

Graves, S. C., & Willems, S. P. (2000). Optimizing strategic safety stock placement
in supply chains. Manufacturing and Service Operations Management, 2, 68-83.

Graves, S. C., Rinnooy Kan, A. H. G., & Zipkin, P. H. (Eds.) (1993). Logistics of 
production and inventory. In Handbooks in operations research and management 
science. Amsterdam: North-Holland. 

Hadley, G., & Whitin, T. M. (1963). Analysis of inventory systems. Englewood 
Cliffs, NJ: Prentice-Hall. 

Haimovich, M., & Rinnooy Kan, A. H. G. (1985). Bounds and heuristics for
capacitated routing problems. Mathematics of Operations Research, 10, 527-542.

Haimovich, M., Rinnooy Kan, A. H. G., & Stougie, L. (1988). Analysis of heuristics
for vehicle routing problems. In B. L. Golden, A. A. Assad (Eds.) Vehicle routing:
methods and studies (pp. 47-61). New York, NY: Elsevier Science Publishers,
B.V.

Hakimi, S. L. (1964). Optimum locations of switching centers and the absolute
centers and medians of a graph. Operations Research, 12, 450-459.






Hall, N. G. (1988). A multi-item EOQ model with inventory cycle balancing. Naval
Research Logistics, 35, 319-325.

Hariga, M. (1988). The warehouse scheduling problem (Ph.D. Thesis, School of
Operations Research and Industrial Engineering, Cornell University).

Harris, F. (1915). Operations and costs. Factory management series (pp. 48-52). 
Chicago: A. W. Shaw Co. 

Hartley, R., & Thomas, L. C. (1982). The deterministic, two-product, inventory
system with capacity constraint. Journal of the Operational Research Society,
33, 1013-1020.

Hax, A. C., & Candea, D. (1984). Production and inventory management. Engle- 
wood Cliffs, NJ: Prentice-Hall. 

Held, M., & Karp, R. M. (1962). A dynamic programming approach to sequencing
problems. SIAM Journal on Applied Mathematics, 10, 196-210.

Held, M., & Karp, R. M. (1970). The traveling salesman problem and minimum
spanning trees. Operations Research, 18, 1138-1162.

Held, M., & Karp, R. M. (1971). The traveling salesman problem and minimum
spanning trees: part II. Mathematical Programming, 1, 6-25.

Hodgson, T. J., & Howe, T. J. (1982). Production lot sizing with material-handling
cost considerations. IIE Transactions, 14, 44-51.

Hoffman, K. L., & Padberg, M. (1993). Solving airline crew scheduling problems 
by branch-and-cut. Management Science, 39, 657-682. 

Holt, C. C. (1958). Decision rules for allocating inventory to lots and cost founda- 
tions for making aggregate inventory decisions. Journal of Industrial Engineer- 
ing, 9, 14-22. 

Homer, E. D. (1966). Space-limited aggregate inventories with phased deliveries. 
Journal of Industrial Engineering, 17, 327-333. 

Hopp, W., Tekin, E., & Van Oyen, M. (2004). Benefits of skill chaining in serial 
production lines with cross-trained workers. Management Science, 50, 83-98. 

House, R. G., & Jamie, K. D. (1981). Measuring the impact of alternative market 
classification systems in distribution planning. Journal of Business Logistics, 2, 
1-31. 

Hu, P. (2011). Coordinated pricing and inventory management (Ph.D. Disserta- 
tion, University of Illinois at Urbana-Champaign) 

Huh, T., & Janakiraman, G. (2008). (s, S) optimality in joint inventory-pricing
control: an alternate approach. Operations Research, 56, 783-790.






Iglehart, D. (1963a). Optimality of (s, S) policies in the infinite horizon dynamic
inventory problem. Management Science, 9, 259-267.

Iglehart, D. (1963b). Dynamic programming and stationary analysis in inventory 
problems. In H. Scarf, D. Guilford, M. Shelly (Eds.) Multi-stage inventory models 
and techniques (pp. 1-31). Stanford, CA: Stanford University Press. 

Jaillet, P. (1985). Probabilistic traveling salesman problem (Ph.D. Dissertation, 
Operations Research Center, Massachusetts Institute of Technology, Cambridge, 
MA) 

Johnson, D. S., & Papadimitriou, C. H. (1985). Performance guarantees for heuris- 
tics. In E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, D. B. Shmoys 
(Eds.) The traveling salesman problem: a guided tour of combinatorial optimiza- 
tion (pp. 145-180). New York: Wiley. 

Johnson, J. C., & Wood, D. F. (1986). Contemporary physical distribution and 
logistics. New York: Macmillan. 

Johnson, D. S., Demers, A., Ullman, J. D., Garey, M. R., & Graham, R. L. (1974).
Worst-case performance bounds for simple one-dimensional packing algorithms.
SIAM Journal on Computing, 3, 299-325.

Joneja, D. (1990). The joint replenishment problem: new heuristics and worst-case
performance bounds. Operations Research, 38, 711-723.

Jones, P. C., & Inman, R. R. (1989). When is the economic lot scheduling problem
easy? IIE Transactions, 21, 11-20.

Jordan, W., & Graves, S. C. (1995). Principles on the benefits of manufacturing
process flexibility. Management Science, 41, 577-594.

Karmarkar, N. (1982). Probabilistic analysis of some bin-packing algorithms.
Proceedings of 23rd Annual Symposium on Foundations of Computer Science,
107-111.

Karlin, S., & Taylor, H. M. (1975). A first course in stochastic processes. San 
Diego, CA: Academic. 

Karp, R. M. (1977). Probabilistic analysis of partitioning algorithms for the trav- 
eling salesman problem. Mathematics of Operations Research, 2, 209-224. 

Karp, R. M., & Steele, J. M. (1985). Probabilistic analysis of heuristics. In E. L. 
Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, D. B. Shmoys (Eds.) The traveling 
salesman problem: a guided tour of combinatorial optimization (pp. 181-205). 
New York: Wiley. 

Karp, R. M., Luby, M., & Marchetti-Spaccamela, A. (1984). A probabilistic analy- 
sis of multidimensional bin packing problems. Proceedings of 16th Annual ACM 
Symposium on Theory of Computing, 289-298. 






Kimes, S. E. (1989). Yield management: a tool for capacity-constrained service firms.
Journal of Operations Management, 8(4), 348-363.

Kingman, J. F. C. (1976). Subadditive processes. Lecture Notes in Mathematics 
539, 168-222. Berlin: Springer.

Klincewicz, J. G., & Luss, H. (1986). A Lagrangian relaxation heuristic for capacitated
facility location with single-source constraints. Journal of the Operational
Research Society, 37, 495-500.

Kuehn, A. A., & Hamburger, M. J. (1963). A heuristic program for locating warehouses.
Management Science, 9, 643-666.

Lau, H. S. (1980). The newsboy problem under alternative optimization objectives.
Journal of the Operational Research Society, 31, 525-535. 

Lawler, E. L. (1976). Combinatorial optimization: networks and matroids. New 
York: Holt, Rinehart and Winston. 

Lawler, E. L., Lenstra, J. K., Rinnooy Kan, A. H. G., & Shmoys, D. B. (Eds.) 
(1985). The traveling salesman problem: a guided tour of combinatorial opti- 
mization. New York: Wiley. 

Lawler, E. L., Lenstra, J. K., Rinnooy Kan, A. H. G., & Shmoys, D. B. (1993). 
Sequencing and scheduling: algorithms and complexity. In S. C. Graves, A. H. 
G. Rinnooy Kan, P. H. Zipkin (Eds.) Handbooks in operations research and 
management science, the Volume on Logistics of production and inventory (pp. 
445-522). Amsterdam: North-Holland. 

Lee, H. L., & Nahmias, S. (1993). Single product, single location models. In S. 
C. Graves, A. H. G. Rinnooy Kan, P. H. Zipkin (Eds.) Handbooks in operations 
research and management science, the Volume on Logistics of production and 
inventory (pp. 3-55). Amsterdam: North-Holland. 

Li, C. L., & Simchi-Levi, D. (1990). Worst-case analysis of heuristics for the multi- 
depot capacitated vehicle routing problems. ORSA Journal on Computing, 2, 
64-73. 

Lindsey (1996). A communication to the AGIS-L list server. 

Lovasz, L. (1979). Combinatorial problems and exercises. Amsterdam: North- 
Holland. 

Love, S. F. (1973). Bounded production and inventory models with piecewise con- 
cave costs. Management Science, 20, 313-318. 

Magnanti, T. L., Shen, Z-J. M., Shu, J., Simchi-Levi, D., & Teo, C-P. (2003). 
Inventory placement in acyclic supply chain networks. Working Paper. Operations
Research Letters, 34, 228-238.






Manne, A. S. (1964). Plant location under economies of scale — decentralization 
and computation. Management Science, 11, 213-235. 

Martinez-de-Albeniz, V., & Simchi-Levi, D. (2005). A portfolio approach to procurement
contracts. Production and Operations Management, 14, 90-114.

Martinez-de-Albeniz, V., & Simchi-Levi, D. (2006). Mean-variance trade-offs in supply
contracts. Naval Research Logistics, 53, 603-616.

Maxwell, W. L., & Singh, H. (1983). The effect of restricting cycle times in the 
economic lot scheduling problem. IIE Transactions, 15, 235-241.

Melkote, S. (1996). Integrated models of facility location and network design 
(Ph.D. thesis, Northwestern University). 

Mirchandani, P. B., & Francis, R. L. (1990). Discrete location theory. New York: 
Wiley. 

Muckstadt, J. M., & Roundy, R. O. (1993). Analysis of multistage production 
systems. In S. C. Graves, A. H. G. Rinnooy Kan, P. H. Zipkin (Eds.) Handbooks
in operations research and management science, the volume on Logistics
of production and inventory (pp. 59-131). Amsterdam: North-Holland. 

Murota, K. (2003). Discrete convex analysis. Philadelphia: Society for Industrial 
and Applied Mathematics. 

Muriel, A., & Simchi-Levi, D. (2003). Supply chain design and planning - applications
of optimization techniques for strategic and tactical models. In A. G. de
Kok, S. C. Graves (Eds.) Handbooks in operations research and management science
(Vol. 11): Supply chain management: design, coordination and operation.
Boston: Elsevier.

Murota, K., & Shioura, A. (2004). Conjugacy relationship between M-convex and 
L-convex functions in continuous variables. Mathematical Programming, 101, 
415-433. 

Murota, K., & Shioura, A. (2005). Substitutes and complements in network flows 
viewed as discrete convexity. Discrete Optimization, 2, 256-268.

Myerson, R. B. (1997). Game theory: analysis of conflict. Cambridge, MA: Harvard
University Press.

Nagarajan, M., & Sosic, G. (2008). Game-theoretic analysis of cooperation among 
supply chain agents: review and extensions. European Journal of Operational 
Research, 187(3), 719-745. 

Nauss, R. M. (1976). An efficient algorithm for the 0-1 knapsack problem. Man- 
agement Science, 23, 27-31. 






Neebe, A. W., & Rao, M. R. (1983). An algorithm for the fixed-charge assigning
users to sources problem. Journal of the Operational Research Society, 34, 1107-1113.

Newton, R. M., & Thomas, W. H. (1969). Design of school bus routes by computer. 
Socio-Economic Planning Sciences, 3, 75-85.

Osborne, M. J. (2003). An introduction to game theory. New York, Oxford: Oxford 
University Press. 

Ozer, O., & Phillips, R. (Eds.) (2012). The Oxford handbook of pricing management.
Oxford: Oxford University Press.

Page, E., & Paul, R. J. (1976). Multi-product inventory situations with one re- 
striction. Journal of the Operational Research Society, 27, 815-834. 

Pang, Z. (2011). Optimal dynamic pricing and inventory control with stock deterioration
and partial backordering. Operations Research Letters, 39, 375-379.

Pang, Z., Chen, Y. F., & Feng, Y. (2012). A note on the structure of joint inventory-pricing
control with leadtimes. Operations Research, 60, 581-587.

Papadimitriou, C. H., & Steiglitz, K. (1982). Combinatorial optimization: algorithms
and complexity. Englewood Cliffs, NJ: Prentice-Hall.

Park, K. S., & Yun, D. K. (1985). Optimal scheduling of periodic activities. Operations
Research, 33, 690-695.

Patton, E. P. (1994). Carrier rates and tariffs. In J. A. Tompkins, D. Harmelink 
(Eds.) The distribution management handbook (Chapter 12). New York: 
McGraw-Hill. 

Peleg, B., & Sudholter, P. (2007). Introduction to the theory of cooperative games
(2nd ed.). Berlin: Springer. 

Petruzzi, N. C., & Dada, M. (1999). Pricing and the newsvendor model: a review 
with extensions. Operations Research, 47, 183-194.

Pinedo, M. (1995). Scheduling: theory, algorithms and systems. Englewood Cliffs, 
NJ: Prentice-Hall. 

Pirkul, H. (1987). Efficient algorithms for the capacitated concentrator location 
problem. Computers and Operations Research, 14, 197-208.

Pirkul, H., & Jayaraman, V. (1996). Production, transportation and distribution 
planning in a multi-commodity tri-echelon system. Transportation Science, 30,
291-302. 

Polyak, B. T. (1967). A general method for solving extremum problems (in Russian).
Doklady Akademii Nauk SSSR, 174, 33-36.






Porteus, E. L. (1985). Investing in reduced setups in the EOQ model. Management 
Science, 31, 998-1010. 

Porteus, E. L. (1990). Stochastic inventory theory. In D. P. Heyman, M. J. Sobel 
(Eds.) Handbooks in operations research and management science, the volume 
on Stochastic models (pp. 605-652). Amsterdam: North-Holland.

Psaraftis, H. N. (1984). On the practical importance of asymptotic optimality in 
certain heuristic algorithms. Networks, 14, 587-596. 

Rhee, W. T. (1988). Optimal bin packing with items of random sizes. Mathematics 
of Operations Research, 13, 140-151. 

Rhee, W. T. (1991). An asymptotic analysis of capacitated vehicle routing. Work- 
ing Paper, The Ohio State University, Columbus, OH. 

Rhee, W. T., & Talagrand, M. (1987). Martingale inequalities and NP-complete 
problems. Mathematics of Operations Research, 12, 177-181. 

Robeson, J. F., & Copacino, W. C. (Eds.) (1994). The logistics handbook. New 
York: Free Press. 

Rockafellar, R. T. (1970). Convex analysis. Princeton, NJ: Princeton University 
Press. 

Rosen, J. B. (1965). Existence and uniqueness of equilibrium points for concave 
N-person games. Econometrica, 33, 520-534.

Ross, S. M. (1983). Introduction to stochastic dynamic programming. London:
Academic Press.

Rosenblatt, M., & Rothblum, U. (1990). On the single resource capacity problem 
for multi-item inventory systems. Operations Research, 38, 686-693.

Rosenkrantz, D. J., Stearns, R. E., & Lewis II, P. M. (1977). An analysis of several 
heuristics for the traveling salesman problem. SIAM Journal on Computing, 6, 
563-581. 

Ross, S. (1970). Applied probability models with optimization applications. San
Francisco: Holden-Day. 

Rosling, K. (1989). Optimal inventory policies for assembly systems under random 
demand. Operations Research, 37, 565-579.

Roundy, R. (1985). 98%-effective integer-ratio lot-sizing for one-warehouse multi-retailer
systems. Management Science, 31, 1416-1430.

Russell, R. A. (1977). An effective heuristic for the M-tour traveling salesman 
problem with some side constraints. Operations Research, 25, 521-524.






Sahni, S., & Gonzalez, T. (1976). P-complete approximation algorithms. Journal 
of the Association for Computing Machinery, 23, 555-565.

Scarf, H. E. (1960). The optimality of (s,S) policies in the dynamic inventory
problem. In K. Arrow, S. Karlin, P. Suppes (Eds.) Mathematical methods in the 
social sciences (pp. 196-202). Stanford, CA: Stanford University Press. 

Schweitzer, M., & Cachon, G. (2000). Decision bias in the newsvendor problem
with a known demand distribution: experimental evidence. Management Sci- 
ence, 46(3), 404-420. 

Shapley, L. (1967). On balanced sets and cores. Naval Research Logistics Quarterly,
14, 453-460. 

Shmoys, D., & Williamson, D. (1990). Analyzing the Held-Karp TSP bound: a
monotonicity property with application. Information Processing Letters, 35, 
281-285. 

Silver, E. A. (1976). A simple method of determining order quantities in joint 
replenishments under deterministic demand. Management Science, 22, 1351- 
1361. 

Silver, E. A., & Meal, H. C. (1973). A heuristic for selecting lot size quantities for 
the case of a deterministic time-varying demand rate and discrete opportunities
for replenishment. Production and Inventory Management, 14, 64-74.

Silver, E. A., & Peterson, R. (1985). Decision systems for inventory management 
and production planning. New York: Wiley. 

Simchi-Levi, D. (1994). New worst case results for the bin-packing problem. Naval
Research Logistics, 41, 579-585. 

Simchi-Levi, D. (2010). Operations rules: delivering customer value through flexible 
operations. Cambridge, MA: MIT Press. 

Simchi-Levi, D., & Bramel, J. (1990). On the optimal solution value of the capac- 
itated vehicle routing problem with unsplit demands. Working Paper, Depart- 
ment of IE&OR, Columbia University, New York. 

Simchi-Levi, D., Kaminsky, P., & Simchi-Levi, E. (2003). Designing and managing 
the supply chain (2nd ed.). Burr Ridge, IL: McGraw-Hill. 

Simchi-Levi, D., Kaminsky, P., & Simchi-Levi, E. (2007). Designing and managing
the supply chain: concepts, strategies and case studies (3rd ed.). McGraw-Hill.

Simchi-Levi, D., Wu, D., & Shen, Z. J. (Eds.) (2004). Handbook of quantitative
supply chain analysis: modeling in the E-business era. New York: Springer. 

Simchi-Levi, D., & Wei, Y. (2012). Understanding the performance of the long chain
and sparse designs in process flexibility. Operations Research, 60(5), 1125-1141.






Solomon, M. M. (1986). On the worst-case performance of some heuristics for the 
vehicle routing and scheduling problem with time window constraints. Networks, 
16, 161-174.

Solomon, M. M., & Desrosiers, J. (1988). Time window constrained routing and 
scheduling problems: a survey. Transportation Science, 22, 1-13.

Stankevich, D. (1996). Ace of diamonds. Discount Merchandiser, 38(8), 28-37. 

Steele, J. M. (1981). Subadditive Euclidean functionals and nonlinear growth in geometric
probability. Annals of Probability, 9, 365-375.

Steele, J. M. (1990). Lecture notes on “probabilistic analysis of algorithms.” 

Stout, W. F. (1974). Almost sure convergence. New York: Academic. 

Strassen, V. (1969). Gaussian elimination is not optimal. Numerische Mathematik,
13, 354-356. 

Swersey, A. J., & Ballard, W. (1984). Scheduling school buses. Management Sci- 
ence, 30, 844-853. 

Talluri, K., & van Ryzin, G. (2004). The theory and practice of revenue manage- 
ment. New York: Springer. 

Tarski, A. (1955). A lattice-theoretical fixpoint theorem and its applications.
Pacific Journal of Mathematics, 5, 285-309. 

Thomas, L. C., & Hartley, R. (1983). An algorithm for limited capacity inventory 
problem with staggering. Journal of the Operational Research Society, 34, 81-85. 

Topkis, D. M. (1998). Supermodularity and complementarity. Princeton, NJ: 
Princeton University Press. 

van Ryzin, G. (2012). Models of demand. In O. Ozer, R. Phillips (Eds.) The Oxford handbook
of pricing management (pp. 340-380). Oxford: Oxford University Press.

Veinott, A. (1966). On the optimality of (s,S) inventory policies: new conditions
and a new proof. SIAM Journal on Applied Mathematics, 14, 1067-1083.

Veinott, A., & Wagner, H. (1965). Computing optimal (s,S) inventory policies. 
Management Science, 11, 525-552. 

Vives, X. (2000). Oligopoly pricing: old ideas and new tools. Cambridge, MA: MIT 
Press. 

Wagner, H. M., & Whitin, T. M. (1958a). Dynamic problems in the theory of the 
firm. Naval Research Logistics Quarterly, 5, 53-74. 

Wagner, H. M., & Whitin, T. M. (1958b). Dynamic version of the economic lot 
size model. Management Science, 5, 89-96. 






Wagelmans, A., Van Hoesel, S., & Kolen, A. (1992). Economic lot sizing: an
O(n log n) algorithm that runs in linear time in the Wagner-Whitin case. Operations
Research, 40(Suppl 1), S145-S156.

Weber, A. (1909). In C. J. Friedrich (Ed. and transl.) Theory of the location of 
industries. Chicago: Chicago University Press. 

Whitin, T. M. (1955). Inventory control and price theory. Management Science, 
2, 61-80.

Wolsey, L. (1980). Heuristic analysis, linear programming and branch and bound. 
Mathematical Programming Study, 13, 121-134.

Yano, C., & Gilbert, S. (2002). Coordinated pricing and production/procurement 
decisions: a review. In A. Chakravarty, J. Eliashberg (Eds.) Managing busi- 
ness interfaces: marketing, engineering and manufacturing perspectives. Boston: 
Kluwer Academic. 

Ye, Q., & Duenyas, I. (2007). Optimal capacity investment decisions with two-sided
fixed-capacity adjustment costs. Operations Research, 55, 272-283.

Yellow, P. (1970). A computational modification to the savings method of vehicle 
scheduling. Operational Research Quarterly, 21, 281-283.

Zangwill, W. I. (1966). A deterministic multi-period production scheduling model 
with backlogging. Management Science, 13(1), 105-119.

Zavi, A. (1976). Introduction to operations research, part II: dynamic programming 
and inventory theory (in Hebrew). Tel-Aviv: Dekel.

Zheng, Y. S. (1991). A simple proof for the optimality of (s, S) policies for infinite 
horizon inventory problems. Journal of Applied Probability, 28, 802-810. 

Zheng, Y. S., & Federgruen, A. (1991). Finding optimal (s,S) policies is about as 
simple as evaluating a single policy. Operations Research, 39, 654-665.

Zipkin, P. H. (2000). Foundations of inventory management. Burr Ridge, IL: Irwin. 

Zipkin, P. H. (2008). On the structure of lost-sales inventory models. Operations
Research, 56, 937-944.

Zoller, K. (1977). Deterministic multi-item inventory systems with limited capacity.
Management Science, 24, 451-455.



Index 



(s,S) policy, 153
L♮-convex function, 36, 40, 170
L♮-convex set, 36, 37
M♮-convex function, 39, 40
M♮-convex set, 39
ε-core, 59
k-opt procedures, 80
k-connected, 110
p-median problem, 283-290, 293, 298
(strictly) increasing differences, 28
Ozer, O., 11
1-tree, 106
2-partition problem, 67

absolute performance ratio, 65 
additivity, 54, 62 
Aggarwal, A., 140 
aggregate monotonicity, 55 
Agrawal, V., 202 
Aho, A. V., 74 
Altinkemer, K., 303, 317 
Angel, R. D., 406 
Anily, S., 83, 123 
anonymity, 54, 55, 61-63 



Archibald, B., 163 
Arkin, E., 147 
Arrow, K., 151 

assembly-distribution system, 173 
asymptotic performance ratio, 66 
asymptotically optimal, 88 
Atkins, D. R., 147 
average-case analysis, 10 
Azuma, K., 89 

Basar, T., 45 

Baker, B. S., 67 

Baker, K. R., 143 

Balakrishnan, A., 266, 274, 275 

balanced, 56 

Balinski, M. L., 284, 359 

Ball, M. O., 11 

Ballard, W., 407 

Barcelo, J., 289 

base planning period, 121 

base stock policy, 153 

Beardwood, J., 92, 331 

Beasley, J., 305, 315 

Bell, C., 163 



D. Simchi-Levi et al., The Logic of Logistics: Theory, Algorithms, and Applications 441 

for Logistics Management , Springer Series in Operations Research and Financial Engineering, 
DOI 10.1007/978-1-4614-9149-1, © Springer Science+Business Media New York 2014 






Bennett, B., 406, 417 
Berman, L., 407, 413 
Bernstein, F., 213 
Bertsekas, D. P., 220 
Bertsimas, D. J., 109, 313 
best-fit, 66 

best-fit decreasing, 66 
Bienstock, D., 110, 326 
bin-packing constant, 296, 320 
bin-packing problem, 66, 86, 100, 297, 
320, 332, 373 
Bodin, L., 407, 413 
Bondareva, O., 56
Borel-Cantelli lemma, 331 
Bouakiz, M., 202

Bramel, J., 316, 320, 335-337, 341, 367, 
368, 413 

branch and bound, 364 

Cachon, G., 202, 213, 230 
Candea, D., 3 

capacitated concentrator location 
problem, 288 

capacitated facility location problem, 
335, 349 

capacitated vehicle location problem 
with time windows, 349 
capacitated vehicle routing problem, 
301 

Casanovas, J., 289 
Chan, L. M. A., 103, 177, 269, 272 
Chandra, P., 80 
Chapleau, L., 413 
Chen, F., 151, 172, 173, 202 
Chen, X., 44, 150, 177, 178, 192, 193, 
200-202, 208, 209, 222 
Chen, Y. F., 159, 201 
Chen, X., 207
Chou, M., 258, 261, 262 
Christofides’ heuristic, 78, 81, 82, 111, 
305, 312 

Christofides, N., 78, 313, 316, 363 
Chua, G., 258, 261, 262 
Churchman, C. W., 123 
circular region partitioning, 309, 310 



Clark, A. J., 151 
Clarke, G., 314, 406 
clique, 365 

cluster-first-route-second, 316 
coalitional form, 52 
coalitional monotonicity, 55, 62 
Coffman, E. G., Jr., 85, 98, 337 
column-generation, 360 
common knowledge, 46 
consecutive heuristic, 327 
consistency, 55, 56, 61, 63 
convex extendable, 40 
convex game, 53, 57 
convexity, 8, 15 
cooperative game, 52 
core, 55, 62, 217 
Cornuejols, G., 284, 317 
Council of Supply Chain Management 
Professionals, 2 

covariance under strategic equivalence, 
54, 55, 61, 63 

crew scheduling problems, 364 
Croxton, K. L., 269 
Cullen, F., 360 
cutting-plane methods, 317 
cycle time, 118 

Dada, M., 177 
Daskin, M. S., 284 
De Kok, 11 

delivery man problem, 356 

Dematteis, J. J., 139 

Denardo, E. V., 163 

Deng, Q., 336 

Deng, S., 150 

depth-first search, 74 

Desrochers, M., 354, 360, 363, 367 

Desrosiers, J., 341, 407 

diagonally dominated M-matrix, 37

discrete convex analysis, 35 

Dobson, G., 124 

double marginalization, 232 

Dreyfus, S. E., 98 

dual-feasible, 112 

Duenyas, I., 207 






dynamic program, 98, 374 

dynamic programming, 93, 98, 363, 374 

echelon holding cost, 131 
echelon inventory position, 172 
economic lot scheduling problem, 124 
economic lot size model, 117, 134, 135 
economic order quantity (EOQ), 119, 
135 

economic warehouse lot scheduling 
problem, 123 
Edmonds, J., 108, 112 
Eeckhoudt, L., 202 
efficiency, 54, 55, 61-63 
Eliashberg, G., 177 
Elmaghraby, W., 177 
Eppen, G., 151, 172 
equidistance convexity, 42 
Erlenkotter, D., 117 
Eulerian graph, 78, 94 
Eulerian tour, 78

Federgruen, A., 139, 140, 151, 163, 172, 
177, 202, 213, 313, 341 
Feng, Y., 201 
Few, L., 91 
first-fit, 66 

first-fit decreasing, 66 
Fisher, M. L., 9, 10, 103, 313, 316, 317, 
337 

Florian, M., 143 
Francis, R. L., 284, 297, 298 
Fudenberg, D., 45 

Gallego, G., 124, 177 
game theory, 8 
Garey, M. R., 9, 66, 68 
Gaskel, T. J., 315 
Gavish, B., 303, 317 
Gazis, D., 406, 417 
Gendron, B., 269 

generalized assignment heuristic, 316, 
337 

geocoding, 409 

geographic information system, 409 



geographic information system (GIS), 
405 

Geunes, J., 150 
Ghosh, A., 217 
Gilbert, S., 177 
Gillett, B. E., 315 
global minimizer, 40 
Goemans, M. X., 109 
Golden, B. L., 81 
Gollier, C., 202 
Gonzalez, T., 72 
Goyal, S. K., 123 
Graves, S. C., 11, 130, 266, 274, 275,
280

greedy algorithm, 57 
group rationality, 54, 55 

Hadley, G., 123, 128 

Haimovich, M., 303, 307, 308, 317 

Hakimi, S. L., 284 

Hall, N. G., 123 

Hamburger, M. J., 284 

Hamiltonian cycle problem, 72 

Hamiltonian path, 81 

Harche, F., 317 

Hariga, M., 124 

harmonic heuristic, 97 

Harris, F., 117 

Harris, T., 151 

Hartley, R., 123 

Hax, A. C., 3 

He, S., 44 

Heching, A., 177 

Held, M., 93, 103, 105, 107, 108 

Hodgson, T. J., 123 

Hoffman, K. L., 360, 364 

Holt, C. C., 123 

Homer, E. D., 123 

Howe, T. J., 123 

Hu, P., 44, 201 

Huh, T., 195 

Iglehart, D., 151, 163 

incomplete optimization methods, 317 






increasing generalized failure rate 
(IGFR), 232 

increasing set function, 33 
independent set, 365 
independent solutions, 123 
individual rationality, 54, 55, 60 
infimal convolution, 40 
Inman, R. R., 124 
integrality property, 105 
intersection graph, 365 
inventory centralization game, 216, 
217, 219, 222, 228 
inventory decomposition property, 143

inventory position, 169 
inventory-balance constraints, 138 
iterated tour-partitioning, 303 
Iyogun, P., 147 

J. Rosen, 52 
Jaikumar, R., 316, 337 
Jaillet, P., 91 
Janakiraman, G., 195 
Jayaraman, V., 292 
Jensen’s inequality, 18, 22, 181 
Johnson, D. S., 9, 66-68, 72, 74 
join, 27 

joint setup cost, 147 
Joneja, D., 147 
Jones, P. C., 124 

K-convexity, 155, 175
Kakutani fixed-point theorem, 48 
Karlin, S., 89 
Karmarkar, N., 321 
Karp, R. M., 92, 93, 97, 103, 105, 107, 
108 

Keskinocak, P., 177 
Kimes, S. E., 177 
Kingman, J. F. C., 86 
Klein, M., 143 
Klincewicz, J. G., 289 
knapsack problem, 289, 295 
Kuehn, A. A., 284 



Lagrangian dual, 104 
Lagrangian relaxation, 103, 105, 284, 
285, 289, 293, 336, 351 
laminar convex function, 39, 40 
laminar family, 40 
lattice, 27 
Lau, H. S., 202 
Law, A. M., 98 

Lawler, E. L., 72, 78, 333, 342 

layered graph, 366 

least-core, 59 

Lee, H. L., 151 

Lueker, G. S., 98

Li, C. L., 305, 319 

local minimizer, 40 

location-based heuristic, 316 

log-supermodular, 214 

loss function, 154 

Lovasz, L., 110 

Love, S. F., 143 

Lueker, G. S., 85, 337 

Luss, H., 289 

machine scheduling problem, 342 
Magnanti, T. L., 269, 280 
Manne, A. S., 284 
Marschak, J., 151 
Martinez-de-Albeniz, V., 202, 231 
martingale inequalities, 89 
MATCH heuristic, 89 
matching, 78 
meet, 27 
Melkote, S., 274 
midpoint convexity, 42 
Miller, L. R., 315 
minimum K-tree methods, 317 
minimum spanning tree-based 
heuristic, 74 

Mirchandani, P. B., 284, 297, 298 
monotone optimal solutions, 38 
Muckstadt, J. M., 121 
Muriel, A., 222, 269, 272 
Murota, K., 15, 43, 262 
Myerson, R., 45 






Nagarajan, M., 213 
Nahmias, S., 151 

Nash equilibrium, 35, 47, 215, 230 
Nauss, R. M., 289 
nearest-insertion heuristic, 75, 76 
nearest-neighbor heuristic, 75, 356 
Neebe, A. W., 289 
nested, 130 
Netessine, S., 213 
network design, 109 
newsboy problem, 152 
newsvendor game, 228 
newsvendor problem, 152, 177, 215, 
230-232 

Newton, R. M., 406 
next-fit, 83, 326 
next-fit increasing, 83 
node cover problem, 298 
non-K-decreasing, 160, 175 
nontransferrable utilities, 53 
NP-complete, 67, 72, 99, 143, 147 
NP-hard, 9, 65, 298, 336, 352, 362 
nucleolus, 59 
null player, 54, 55, 61, 62 

odd hole, 366 

offline algorithms, 66 

Olsder, G., 45 

online algorithms, 66 

optimal matching of pairs, 332, 333 

optimal partitioning, 305, 319 

optimal partitioning heuristic, 315 

order-up-to-level, 153

Osborne, M., 45 

Padberg, M., 360, 364 

Page, E., 123 

Pan, L., 201 

Pang, Z., 175, 201 

Papadimitriou, C. H., 72, 74, 81, 82 

Park, J. K., 140 

Park, K. S., 123 

parsimonious property, 109 

part-period balancing heuristic, 139 



Paul, R. J., 123 
Peleg, B., 45 
perfect packing, 321 
Peterson, E., 123 
Petruzzi, N. C., 177 
Phillips, R., 11 

pickup and delivery problem, 357 
Pinedo, M., 342 
Pirkul, H., 289, 292, 335 
planning horizon, 137 
polar region partitioning, 309, 310 
Polyak, B. T., 104 
Porteus, E. L., 136, 151, 163 
power-of-two policies, 121 
prize-collecting traveling salesman 
problem, 372 
probabilistic analysis, 85 
production sequence, 144 
properties of L♮-convex functions, 37
Psaraftis, H. N., 337 

quadratic L♮-convex function, 37
quadratic M♮-convex function, 40
Quandt, R. E., 359 
quasiconvex, 18, 42, 159, 163, 195 

Rao, M. R., 289 
rate of convergence, 337, 372 
rectangular region partitioning, 309, 
310 

regeneration point, 144, 163 

region partitioning, 93, 324 

reorder point, 153 

Rhee, W. T., 89, 321, 337 

Rinnooy Kan, A. H. G., 303, 307, 308 

Rockafellar, T., 15, 22 

Rosenblatt, M., 123 

Rosenkrantz, D. J., 75, 77 

Rosling, K., 151 

Ross, S., 163, 168 

rotation cycle policies, 123 

Rothblum, U., 123 

Roundy, R., 121, 130 

route-first-cluster-second, 315

Russell, R. A., 315 






Sahni, S., 72 
savings, 314 

savings algorithm, 314, 406 

Scarf, H. E., 151, 155, 169 

Schlesinger, H., 202 

Schrage, L., 151, 172 

Schwarz, L. B., 130 

Schweitzer, M., 202 

seed customers, 317, 337, 350 

seed-insertion heuristic, 353 

Seshadri, S., 202 

set-partitioning, 359 

set-partitioning problem, 100 

Shapley value, 61 

Shapley, L., 56 

Shen, M., 177, 280 

Shioura, A., 43, 262 

Shmoys, D., 109 

shortcut, 74 

Shu, J., 280 

Shum, S., 201 

Silver, E. A., 123, 147, 163 

Silver-Meal heuristic, 139, 147, 150

Sim, M., 178, 202 

Simchi-Levi, D., 7, 11, 67, 110, 150,
177, 178, 192, 193, 200-202,
208, 222, 231, 269, 272, 280,
305, 313, 316, 319, 320,
335-337, 341, 367, 368
Simchi-LevilO, 7 

sliced interval partitioning heuristic, 87 

sliced region-partitioning heuristic, 333 

Sosic, G., 213 

Sobel, M. J., 202 

Solomon, M. M., 341, 353 

solution concept, 53 

spanning tree, 73 

splittable demands, 302 

Stackelberg game, 46, 231 

staggering problem, 123 

Stankevich, D., 217 

star-tours heuristic, 353 

stationary, 130 

stationary order sizes and intervals, 124 



Steele, J. M., 86, 92, 97 

Steiglitz, K., 74, 81, 82 

Steinberg, R., 177 

Stewart, W. R., 81 

Stout, W. F., 89 

Strassen, V., 256

strategic form, 52 

strong monotonicity, 55, 62 

subadditive processes, 86, 327 

subgradient optimization, 104, 285, 295 

submodular, 36 

subtour elimination, 107 

Sudholter, P., 45 

Sun, P., 178, 202, 207 

superadditive game, 53 

superadditivity, 54, 55

supermodular, 214 

supermodular game, 35, 49, 53, 215 

supermodularity, 8, 15, 27, 194 

Swann, J., 177 

sweep algorithm, 315, 316, 326 
Swersey, A. J., 407 
symmetry, 54, 61, 62 
system inventory, 131 

Talagrand, M., 89 
Talluri, K., 178 

Tarski fixed-point theorem, 35, 49 

Tarski, A., 35 

Taylor, H. M., 89 

Teo, C-P., 280

Teo, C., 258, 261, 262 

the Bondareva-Shapley theorem, 56 

Thomas, L. C., 123 

Thomas, W. H., 406 

time window, 341 

Tirole, J., 45 

Topkis, D., 15 

transferrable utilities, 53 

translation submodular, 36 

traveling salesman problem, 72, 91, 105 

triangle inequality, 73 

two-phase method, 316 

Tzur, M., 139, 140 






unequal-weight iterated tour
partitioning, 317
unimodal, 159 
unsplit demands, 313 

van Ryzin, G., 177, 178, 341 
vehicle routing problem, 301 
vehicle routing problem with distance 
constraints, 356 

vehicle routing problem with time 
windows, 341 

Veinott, A., 151, 159, 161, 163 
Vives, X., 45



Wagelmans, A., 140 
Wagner, H., 138, 150, 151, 163 
wandering salesman problem, 82 
weak consistency, 55 
Weber, A., 284 

Whitin, T. M., 123, 128, 138, 150, 177 
Willems, S., 280 



Williamson, D., 109 
Wolsey, L., 109 
worst-case analysis, 9 
worst-case effectiveness, 65 
Wright, J. W., 314, 406 

Yano, C., 150, 177 
Ye, Q., 207 
Yellow, P., 315 
Yun, D. K., 123 

Zangwill, W. I., 139 
Zavi, A., 135 

zero-inventory-ordering property, 118, 
119, 130, 131, 138 
Zhang, J., 222 
Zhang, Y., 201, 209 
Zheng, H., 258, 261, 262 
Zheng, Y. S., 151, 163, 172, 173 
Zhou, S., 201, 209 
Zipkin, P. H., 151, 170, 172 
Zoller, K., 123 



